Abstract
Robotic systems frequently operate under changing dynamics, such as driving across varying terrain, encountering sensing and actuation faults, or navigating around humans with uncertain and changing intent. In order to operate effectively in these situations, robots must be capable of efficiently estimating these changes in order to adapt at the decision-making, planning, and control levels. Typical estimation approaches maintain a fixed set of candidate models at each time step; however, this can be computationally expensive if the number of models is large. In contrast, we propose a novel algorithm that employs an adaptive model set. We leverage the idea that the current model set must be expanded if its models no longer sufficiently explain the sensor measurements. By maintaining only a small subset of models at each time step, our algorithm improves on efficiency; at the same time, by choosing the appropriate models to keep, we avoid compromising on performance. We show that our algorithm exhibits higher efficiency in comparison to several baselines, when tested on simulated manipulation, driving, and human motion prediction tasks, as well as in hardware experiments on a 7 DOF manipulator.
Keywords: Probabilistic Inference, Motion and Path Planning, Human-Aware Motion Planning
I. Introduction
Whether operating on the road, on a deep space exploration mission to a distant world such as Europa, or in a household around people, robots frequently face changing dynamics. These changes arise for a variety of reasons, such as traversing changing terrains, faults induced by wear and tear from extreme operating conditions, or navigating around people with uncertain and changing intentions.
In these situations, we can typically obtain models of each dynamics mode via first-principles or data-driven approaches. Nevertheless, during operation, the robot will have uncertainty about which dynamics mode is currently occurring. In order to plan effectively under this uncertainty, we must estimate the dynamics based on sensor measurements. This estimation problem can be posed as a general filtering problem over the space of possible dynamics models. However, this set of models is typically large in practice. Furthermore, since the dynamics switch over time, an optimal estimator must track all possible mode sequences, the number of which grows geometrically in the time step. While there exist approximate estimators [1, 2], these typically still require large model sets at estimation time, which can be computationally expensive to manage.
In this work, we propose to use only a small subset of models for estimation at each time step. The fundamental challenge with such an approach, however, is that the best model may not be in this subset. To address this, our key idea is that the robot can detect when none of the current models sufficiently explain the sensor measurements, which in turn serves as an indicator of when to expand the model set. Equipped with this idea, we propose a multiple model estimator with a novel mechanism for adapting the model set. Starting with a small subset of models, our algorithm only expands that set when the true observations are assigned low likelihood under all models in the current model set. To determine which model to add, we measure the predictive performance of each model currently not in the set and add the one that assigns the highest likelihood to the true observations. Further, to keep the subset as small as possible, we remove any models whose low posterior probability indicates they are no longer needed to explain the measurements; should they become necessary again, we will detect this and add them back later on.
We experimentally evaluate the efficiency, accuracy, and robustness of our algorithm in simulation across the following domains: trajectory tracking for 3 DOF and 6 DOF manipulators that encounter actuation faults; trajectory tracking for a skid-steering vehicle driving across uncertain and changing terrains; and trajectory planning for a Dubins’ car-like robot navigating around a human with changing intentions. Through these evaluations, we show that our adaptive estimation algorithm is computationally favorable compared to non-adaptive estimators, without sacrificing estimation performance. Additionally, we found that our adaptive estimator was actually more accurate at predicting the true system mode in all experiments. We attribute this to an additional layer of filtering in our mechanism for adapting the model set, which provides more stability in predicting the most likely mode.
II. Background and Related Work
In this work, we wish to estimate the state xk at time k, given the sequence of measurements observed, y0:k, and controls taken, u0:k. We assume the system evolves according to:
x_{k+1} = f_{m_k}(x_k, u_k) + w_k,   y_k = g_{m_k}(x_k) + v_k   (1)
where m_k ∈ {1, 2, …, N} is the mode of the system at time k, and w_k and v_k denote the process and measurement noise, respectively. We assume access to a set of candidate models that could characterize the dynamics mode of the robot, obtained from first-principles or identified via data-driven approaches. We further assume that the state and mode are initially unknown. So, at any given time k, we must simultaneously estimate the state and mode of the system. Note that we do, however, assume full knowledge of the system dynamics in Eq. 1.
Consider the state estimator of xk, defined to be the expected state given the sequence of measurements taken up to time k. Then
x̂_k = E[x_k ∣ y_{0:k}] = Σ_{m_{0:k}} E[x_k ∣ y_{0:k}, m_{0:k}] · Pr(m_{0:k} ∣ y_{0:k})   (2)
Observe that such an estimator requires enumerating all possible mode sequences of length k + 1, or N^{k+1} sequences, and thus typically cannot be implemented [3-5]. Even in the case where we make the simplifying assumption that we know the state xk, and only need to estimate mk, this geometric complexity remains.
Due to these issues, prior work has focused on developing approximate methods, which do not track all N^{k+1} mode sequences to estimate the state and/or mode. We broadly divide such related work into fixed model set, online model learning, skill learning, and adaptive model set approaches.
Fixed model set.
There has been a wealth of research on estimation algorithms that maintain a fixed set of models (pre-defined or learned a priori) to simultaneously estimate the state and mode of the system. The Multiple Model Adaptive Estimation (MMAE) algorithm [6] keeps a bank of estimators, one per mode, each of which computes the mode-conditioned estimate E[x_k ∣ y_{0:k}, m_k = i]; traditionally, these are implemented as Kalman filters. To handle tracking of the mode sequence, the MMAE algorithm makes the simplifying assumption that the mode is fixed but unknown at k = 0, and therefore the estimator in Eq. 2 simplifies to:
x̂_k = Σ_{i=1}^{N} E[x_k ∣ y_{0:k}, m_k = i] · Pr(m_k = i ∣ y_{0:k})   (3)
While computationally convenient, such an approximation often does not perform well when the mode evolves over time, as in Eq. 1. To address this, the Interacting Multiple Model (IMM) algorithm [1, 2] adds a mixing step to the estimator update at each time step, to account for the mode switches that may occur over time. While still an approximation, IMM estimation typically outperforms the MMAE algorithm, without considerable overhead [7]. Several other algorithms have been designed to handle such switches, a notable class of these being the generalized pseudo-Bayesian (GPB) estimators [8-10]. Finally, Cully et al. leverage an offline precomputation to enable the system to detect and compensate for failures (e.g. the loss of an actuator) [11]; while no new models are learned online, their system is still capable of adapting to novel situations at execution time. Our method is complementary to these approaches, as we introduce a mechanism for adapting the set of models, rather than keeping a fixed set for all time.
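To make this baseline concrete, the following is a minimal sketch of a single IMM-style update over a bank of mode-conditioned Kalman filters; the transition matrix Pi and the kf_predict_update helper are assumptions for illustration, not the exact implementation used in this work.

```python
import numpy as np

def imm_step(x, P, mu, Pi, u, y, kf_predict_update):
    """One IMM-style update over N mode-conditioned Kalman filters.

    x:  list of N prior state estimates (one per mode)
    P:  list of N prior covariance matrices
    mu: length-N array of prior mode probabilities
    Pi: N x N mode transition matrix, Pi[i, j] = Pr(m_k = j | m_{k-1} = i)
    kf_predict_update(x0, P0, u, y, mode) -> (x_new, P_new, likelihood)
        runs the mode-conditioned Kalman predict/update and returns the
        likelihood it assigns to the measurement y (an assumed helper).
    """
    N = len(x)
    # 1) Mixing probabilities: Pr(previous mode i | current mode j).
    c = Pi.T @ mu                               # predicted mode probabilities
    mix = (Pi * mu[:, None]) / c[None, :]

    x_new, P_new, lik = [], [], np.zeros(N)
    for j in range(N):
        # 2) Mixed initial condition for the filter matched to mode j.
        x0 = sum(mix[i, j] * x[i] for i in range(N))
        P0 = sum(mix[i, j] * (P[i] + np.outer(x[i] - x0, x[i] - x0)) for i in range(N))
        # 3) Mode-matched predict/update.
        xj, Pj, lik[j] = kf_predict_update(x0, P0, u, y, mode=j)
        x_new.append(xj)
        P_new.append(Pj)

    # 4) Mode probability update from the measurement likelihoods.
    mu_new = c * lik
    mu_new /= mu_new.sum()
    # 5) Combined output estimate.
    x_hat = sum(mu_new[j] * x_new[j] for j in range(N))
    return x_new, P_new, mu_new, x_hat
```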
Online model learning.
In Event-Triggered Learning (ETL), the algorithm learns a new model when there is a mismatch between what the existing model predicts, and what is actually measured [12]. Harris et al. introduce a method for measuring the overall performance of a control system for the purposes of indicating that a chosen controller may be insufficient for the task at hand [13]. MOSAIC [14] simultaneously learns the set of all possible models, as well as how to select the subset relevant for the current environment in a single learning-based framework. In contrast to these approaches, our method selects from an existing set of models acquired a priori. These online learning approaches, however, are complementary: a combined algorithm could determine whether to choose from this existing set of models or identify a new model online.
Learning new skills.
Within robot skill learning, some prior work focuses on how to combine existing model sets with new model sets (e.g. of how to accomplish a task). Koert et al. manage a set of skills, represented by Gaussian Mixture Models, and introduce a mechanism for adding and removing skills from this set, analogous to our procedures for adding and removing models from a model set [15]. Similarly, Maeda et al. introduce a mechanism for deciding whether to rely on previously learned skills (represented as Gaussian Process motion primitives) to accomplish a task, or to signal learning of a new skill [16]. While these approaches employ a similar idea of managing a set of models via adding, removing, and updating mechanisms, they do not apply directly to the simultaneous mode and state estimation problem that we address in this paper.
Adaptive model set.
In some cases, such as when computational resources are constrained and the number of possible modes is high, it is desirable to adapt the model set over time. Estimators that choose a model set from an existing set of models are generally referred to as Variable Structure Multiple Model (VSMM) algorithms [17, 18]; our approach resides in this category. One approach is to frame the question of how to adapt the model set as a set of statistical hypothesis tests [4, 19]. However, we would need to perform 2^N of these, which is computationally prohibitive. To address this, certain methods leverage structure in the system, such as the Model-Group Switching (MGS) algorithm [20, 21]. In many robotics settings, however, we often cannot assume such structure (e.g. sensing and actuation faults are unpredictable). Our proposed approach, in contrast, scales linearly (rather than geometrically) with N, and does not assume any structure on the mode switching dynamics.
III. The Adaptive Model Set (AMS) Algorithm
A. Overview
Our proposed Adaptive Model Set (AMS) estimator applies the IMM algorithm with the addition of a model set update via Alg. 1 at each step. We wish to use only a small subset of all possible models at each time step in order to mitigate computational expense.
When to expand the model set.
Our key idea is to expand the current model set only if the current models are insufficient to explain the sensor measurements. We implement this in line 6, which checks a threshold on the measurement likelihood: p(y_k ∣ y_{0:k−1}, m_k = i) > β. If none of the current models can explain the current measurement, then Alg. 1 expands the model set.
Majority voting.
Instead of deciding to expand the model set on the basis of a single measurement, our algorithm considers a majority vote over this decision across NV time steps (lines 6-10). For example, in Fig. 2, the algorithm expands the model set only after a majority of votes between time step k1 and k1 + NV agree that it is necessary to do so. In Fig. 2, grey indicates a vote against adding models, and red indicates a vote for adding models at that time step.
Fig. 2:
An example depicting the operation of our AMS estimator on the Kinova JACO 7 DOF manipulator system. As in an experiment in Sec. VII, the end-effector follows a position trajectory. The person immobilizes one of the robot’s joints at time step k1. By time step k1 + NV, the algorithm detects that the current model set is insufficient to explain the encoder measurements, so it expands the model set appropriately. Later, the algorithm removes the models that are no longer needed to explain the measurements. At time step k2, the person lets go of the robot, allowing the joint to move freely again.
If NV > 1, then the system may have switched modes at some point during the past NV time steps. To account for this, we enumerate all mode sequences of length NV, illustrated by the trees of mode sequences at time steps k1+NV and k2+NV in Fig. 2. Then, for each mode sequence M, we instantiate a filter to compute the state estimate at time step k conditioned on mode sequence M between time steps k – NV + 1 and k. To do so, we instantiate the filter with the state estimate at k – NV + 1 (line 14), and then for each time step up until k, we update the filter with the input and measurements that were received at that time step (lines 15-17). In order to perform these updates, we must store uk,yk, as well as some additional information about the estimates, for the last NV time steps (e.g. for a Kalman filter, we would store the previous state estimate and estimation error covariances).
While we enumerate all mode sequences of length NV, there exist several possible optimizations. For example, we could assume that the system switched modes only once during the voting period.
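As a concrete illustration of this enumeration, the sketch below instantiates one filter per mode sequence of length NV and replays the buffered inputs and measurements through it (cf. lines 14-17 of Alg. 1 as described above); make_filter and the update(u, y, mode=...) interface are assumed for illustration.

```python
from itertools import product

def enumerate_candidate_filters(models, x_init, P_init, u_hist, y_hist, make_filter):
    """Build one filter per mode sequence of length NV by replaying buffered data.

    models: iterable of all candidate mode indices (not just the active set)
    x_init, P_init: stored state estimate and covariance at time k - NV + 1
    u_hist, y_hist: the last NV control inputs and measurements
    make_filter(x0, P0) -> filter object with an update(u, y, mode=...) method
    """
    NV = len(y_hist)
    candidates = {}
    for M in product(models, repeat=NV):         # all N**NV mode sequences of length NV
        f = make_filter(x_init, P_init)
        for mode, u, y in zip(M, u_hist, y_hist):
            f.update(u, y, mode=mode)            # replay the buffered data under this mode
        candidates[M] = f
    return candidates
```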
Which models to add.
Once it has constructed all candidate filters, Alg. 1 selects the best performing models, with respect to an evaluation function q. This evaluation function is specific to the type of filter that we are using, and provides a measure of how well the filter predicts the current observed measurements. For example, if we are using Kalman filters, then q(F_M) is the likelihood the filter assigns to the observed measurement, computed from the filter’s residual (i.e. the difference between the predicted and observed measurement) under mode i, the final mode in the mode sequence M. Finally, Alg. 1 adds the best performing model (line 19), as well as all other filters F_M whose performance is close to the best with respect to q, to the new filter set M_k (lines 20-22).
Removing models.
If a model has low a posteriori probability Pr(mk–1 = i∣ y0:k–1) (denoted as pi in the pseudocode) at the previous time step k – 1, then the algorithm removes it (lines 3-5); hence, models not needed to explain the data seen thus far are not kept in the model set. While prematurely removing a model temporarily degrades the estimator performance, if that model is important, it will simply be added again at a later time step.
Algorithm 1: Updates the model set at time step k, given the latest measurement and control input.
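Since the full pseudocode listing is not reproduced here, the following is a hedged Python sketch of the update as described in Sec. III: remove low-posterior models, vote on whether the active set still explains the measurement, and, once a majority of the last NV votes call for it, score candidate filters and add the best model along with any within γ of it. The helpers build_candidate_filters and evaluate_q, and the exact bookkeeping of the votes, are illustrative assumptions; the line numbers quoted in the text refer to the original listing, not to this sketch.

```python
def ams_model_set_update(active_set, posteriors, likelihoods, votes,
                         alpha, beta, gamma, NV,
                         build_candidate_filters, evaluate_q):
    """Sketch of the AMS model set update at time step k (based on Sec. III).

    active_set:  set of mode indices currently maintained
    posteriors:  dict mode -> Pr(m_{k-1} = i | y_{0:k-1})
    likelihoods: dict mode -> p(y_k | y_{0:k-1}, m_k = i) for the active modes
    votes:       list of the most recent binary votes for expanding the set
    build_candidate_filters() -> dict mapping mode sequence -> replayed filter
        (e.g. a closure around the enumeration sketched earlier)
    evaluate_q(filter) -> score, higher meaning the filter better predicts y_k
    """
    # Remove models whose posterior probability indicates they are no longer needed.
    active_set = {i for i in active_set if posteriors[i] >= alpha}

    # Vote: does any active model explain the current measurement?
    explained = any(likelihoods[i] > beta for i in active_set)
    votes.append(not explained)
    votes[:] = votes[-NV:]                        # keep only the last NV votes

    # Expand only if a majority of the recent votes call for it.
    if len(votes) == NV and sum(votes) > NV / 2:
        candidates = build_candidate_filters()    # one filter per mode sequence
        scores = {M: evaluate_q(f) for M, f in candidates.items()}
        best = max(scores.values())
        for M, s in scores.items():
            if best - s <= gamma:                 # add all models close to the best
                active_set.add(M[-1])             # final mode of the sequence
        votes.clear()                             # reset the vote window after expanding
    return active_set, votes
```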
B. Parameters
Threshold for removing models (α).
The parameter α (line 4) determines when to remove a model, based on the a posteriori probability of that particular model best representing the system mode. In our experiments, we use α = 0.01.
Threshold for expanding the model set (β).
The parameter β is used to determine when to expand the model set (line 6), and may in fact vary over time. For example, in our experiments, since we use Kalman filters that assume Gaussian measurement likelihood distributions, we set β to be the likelihood of a measurement that is two standard deviations away from the predicted measurement.
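For instance, assuming a multivariate Gaussian innovation with covariance S, one way to realize this choice (a sketch under that assumption) is to evaluate the Gaussian density at a Mahalanobis distance of two from the predicted measurement.

```python
import numpy as np

def two_sigma_threshold(S):
    """Gaussian density at Mahalanobis distance 2 from the predicted measurement.

    S is the innovation (predicted-measurement) covariance; measurements whose
    likelihood falls below this value lie more than two standard deviations
    from what the filter predicted.
    """
    d = S.shape[0]
    peak = 1.0 / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(S))
    return peak * np.exp(-0.5 * 2.0 ** 2)
```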
Threshold for adding models (γ).
The parameter γ determines how the algorithm “groups” similarly performing models (line 20). An alternative approach would add only the models whose q-value is above a fixed threshold. In the case where none of the models adequately explain the measurements with respect to this threshold, no models would be added, causing poor performance. Therefore, it is better to add the best models we have relative to one another, rather than with respect to a fixed threshold. In practice, the value of γ depends on the scale of the evaluation function q, and should typically be small.
Number of votes (NV).
The parameter NV determines the number of votes before expanding the model set. This parameter captures a trade-off between robustness (explored experimentally in Sec. V) and computational efficiency, since the number of mode sequences is on the order of N^{NV}.
IV. Performance Evaluation of AMS Estimator
We first evaluate if the AMS estimator enables high estimation accuracy, while decreasing the computational load. We evaluate our estimator on 2 simulated systems: a planar 3 DOF manipulator, and a 6 DOF manipulator, proposed for a future NASA lander mission to Europa [22] (see Fig. 3).
Fig. 3:
NASA 6 DOF Europa lander arm, tracking a position trajectory for the end-effector (the scoop). The actuator on the second joint suffers a temporary 75% degradation. Using the nominal model for controlling the manipulator leads to a trajectory that diverges from the reference, since the controller is unable to compensate for the degradation.
Simulated Behavior.
Each experiment is 3.1 s long, with a time step of 0.01 s. The system starts in the nominal mode, switches to another mode after 0.75 s, and then switches back to nominal after 1.75 s.
Nominal: Manipulator is operating normally. We apply a first-order Euler discretization in time to the manipulator equations (see [23]) to derive an equation of the form in Eq. (1).
Locked joint: A single joint is completely immobile.
Free-swinging joint: All input torque at the free-swinging joint is set to zero.
Degraded actuator: The input torque at the degraded actuator is multiplied by a scalar degradation factor in the interval (0, 1). We consider degradation factors of 0.25, 0.50, and 0.75.
Each of these modes, other than the nominal mode, can occur at each joint. Therefore, to count the total number of modes, we multiply by the number of joints. So, in these experiments, we consider 17 possible modes for the 3 DOF arm, and 32 modes for the 6 DOF arm.
Control Objective.
We consider two control objectives: the first is a jointspace tracking task, where the goal is for the manipulator to follow a time-varying, sinusoidal sequence of joint positions, velocities, and accelerations; the second is a taskspace tracking task, where the goal is for the end-effector to follow a time-varying sequence of positions and velocities in taskspace.
Control Law.
For both control objectives, we use a computed torque control (CTC) law [23]. We design the CTC input based on the model with highest a posteriori probability at each time step.
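As a sketch, a computed torque controller of the standard form in [23] could be written as below, with the inverse dynamics terms evaluated under the maximum a posteriori model; the model API (mass_matrix, coriolis_matrix, gravity_vector) is assumed for illustration.

```python
import numpy as np

def computed_torque(q, dq, q_des, dq_des, ddq_des, model, Kp, Kv):
    """Computed torque control: feedback-linearize with the selected model's
    dynamics, then apply PD feedback on the joint-space tracking error."""
    e, de = q_des - q, dq_des - dq
    # Desired joint acceleration with PD correction on the tracking error.
    a = ddq_des + Kv @ de + Kp @ e
    # Inverse dynamics under the maximum a posteriori model (illustrative API).
    M = model.mass_matrix(q)
    C = model.coriolis_matrix(q, dq)
    g = model.gravity_vector(q)
    return M @ a + C @ dq + g
```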
Independent Variables.
We manipulate whether we adapt the model set or keep it fixed; this leads to three non-adaptive baselines to compare with our adaptive estimator.
Ground-Truth (GT): A Kalman filter using the ground-truth model at each time step.
Nominal (N): A Kalman filter using the nominal internal model.
Interacting Multiple Model (IMM): The IMM algorithm described in Sec. II. Note that the model set includes all possible models at each time step.
Adaptive Model Set (AMS): Our proposed AMS estimator, described in Alg. 1. For the AMS algorithm, we also manipulate NV.
Dependent Measures.
Mode Prediction Accuracy: The percentage of time steps for which the estimator correctly predicts the system mode. Here, the estimator predicts the mode with maximum a posteriori probability.
State Estimation Error: Defined for each time step to be the norm of the difference between the actual state and the estimated state.
Position/Velocity Tracking Error: Defined as the norm of the difference between the desired position/velocity and the actual position/velocity. For the jointspace control objective, position/velocity refers to the joint angles/velocities. For the taskspace control objective, position/velocity refers to the end-effector position/velocity.
Estimator Update Time: Defined for each time step as the amount of time (in seconds) needed to update the state estimate.
All measures reported in the tables in this paper are mean values, with standard error reported in parentheses.
Hypotheses.
H1: Adapting the model set, via our AMS algorithm, performs better with respect to computation time than a non-adaptive estimator.
H2: Adapting the model set, via our AMS algorithm, performs at least as well as a non-adaptive estimator with respect to state estimation and trajectory tracking error.
Trials.
For each algorithm, we conduct 50 trials, each with a different random seed, for every combination of manipulator, true mode the system switches to, and control objective.
Analysis.
For the 3 DOF and 6 DOF systems, our AMS estimator is faster than the IMM estimator on both control objectives, supporting H1. Fig. 4 shows the update times for the 6 DOF experiments; the 3 DOF experiments follow the same trend. Our estimator also performs similarly to or better than the IMM estimator with respect to estimation and tracking errors on both control objectives, shown in Table I, in support of H2. We note that AMS outperforms IMM in mode prediction accuracy, shown in Fig. 4.
Fig. 4:
Performance comparison of an IMM estimator versus our AMS estimator for the 6 DOF arm. Error bars indicate standard deviation.
TABLE I: 3 DOF Manipulator

| | Estimation Error, Joint (rad) | Estimation Error, Task (m) | Tracking Error, Joint (rad) | Tracking Error, Task (m) |
|---|---|---|---|---|
| N | 1.178 (4.7e-3) | 0.202 (7.1e-4) | 1.008 (4.6e-3) | 0.070 (2.7e-4) |
| IMM | 0.047 (4.8e-5) | 0.052 (5.2e-5) | 0.093 (3.0e-4) | 0.019 (7.2e-5) |
| AMS | 0.045 (5.1e-5) | 0.053 (6.3e-5) | 0.082 (3.0e-4) | 0.020 (1.1e-4) |
| GT | 0.044 (4.8e-5) | 0.044 (4.7e-5) | 0.074 (3.0e-4) | 0.012 (6.3e-5) |
We observe no effect of voting for the 3 DOF manipulator, in that the performance of our estimator with and without voting is comparable; however, in the 6 DOF manipulator experiments, we see about a 10% increase in mode prediction accuracy when using our estimator, compared to using the IMM estimator. This is likely because voting is designed to help when there is ambiguity over which model best describes the system mode. These ambiguities are more prone to occur in the 6 DOF experiments, where there are 32 possible models, than in the 3 DOF experiments, where there are only 17 possible models.
We also observe that all estimators perform better on the jointspace objective than on the taskspace objective. In the jointspace objective, the manipulator follows a sinusoidal trajectory; it is likely that this trajectory more persistently excites the system when compared to the taskspace trajectory, leading to better estimator performance.
Summary.
We find that our AMS algorithm is more computationally efficient than the baselines, while not compromising on performance. In some cases our method outperforms the baselines, most notably in mode prediction accuracy. This is partly because our AMS estimator provides an additional layer of filtering: we only expand the model set when the existing models poorly explain the measurements. This prevents the maximum a posteriori probability model from switching as frequently as in the IMM estimator.
V. Robustness to Misspecified Models
Our first experiment analyzed the performance of our AMS algorithm in situations where the models perfectly describe the dynamics of the system. Next, we verify that working with an adaptive model set, rather than the full one as in the IMM algorithm, does not negatively affect accuracy in situations where the models are imperfect: where the noise model is inaccurate, where the link masses are imperfectly characterized, or where we are missing a model of the correct mode altogether. We keep the same 3 DOF manipulator system, as well as the same models, control objective, control law, trials, and dependent measures as in Sec. IV.
Independent Variables.
In addition to the estimation algorithms described in Sec. IV, we further manipulate the following three variables in separate experiments to evaluate robustness: the standard deviation of the simulated sensor noise versus the standard deviation of the sensor noise assumed by the model; whether the link masses are mismodeled (i.e. the actual simulated masses are perturbed randomly in the range [−0.2, 0.2] kg from the modeled values used by the estimator); and whether or not the estimator has access to a model of the mode to which the system switches.
Hypotheses.
H3: Our AMS estimator does not lead to further degradation in performance when faced with (a) mismodeling of the measurement noise; (b) mismodeling of the link masses; and (c) a completely unknown system mode.
Analysis.
For NV = 1, while the errors for our estimator are in some cases greater than for the IMM estimator, the difference is never more than 1 degree on average, as shown in Tables II, III, and IV. Increasing NV beyond 1 showed no effect on performance for the second and third experiments, and hence those results are not shown in the respective tables. For the first experiment, however, increasing NV to 3, 5, and 7 provided additional robustness to the mismodeling of sensor noise as measured by mode prediction accuracy, as shown in Fig. 5. Despite this, there is not necessarily an improvement in performance from the perspective of estimation and tracking errors in Table II. Furthermore, the geometric complexity in NV of the model set expansion starts to have an effect when NV increases to 5 and 7 votes. Overall, we find that the performance of our AMS estimator is similar to the performance of the IMM estimator, in support of H3.
TABLE II: 3 DOF Manipulator, Mismodeled Sensor Noise (H3 (a))

| | Estimation Error, Joint (rad) | Estimation Error, Task (m) | Tracking Error, Joint (rad) | Tracking Error, Task (m) |
|---|---|---|---|---|
| IMM | 0.090 (1.45e-4) | 0.084 (9.01e-5) | 0.137 (3.19e-4) | 0.026 (7.33e-5) |
| AMS1 | 0.095 (1.71e-4) | 0.095 (1.10e-4) | 0.135 (3.67e-4) | 0.023 (7.24e-5) |
| AMS3 | 0.086 (1.65e-4) | 0.091 (1.07e-4) | 0.146 (5.60e-4) | 0.019 (7.65e-5) |
| AMS5 | 0.090 (3.74e-4) | 0.090 (1.21e-4) | 0.180 (1.21e-3) | 0.019 (8.20e-5) |
| AMS7 | 0.096 (6.15e-4) | 0.090 (1.60e-4) | 0.199 (1.66e-3) | 0.020 (8.61e-5) |
TABLE III: 3 DOF Manipulator, Incorrect Link Masses (H3 (b))

| | Estimation Error, Joint (rad) | Estimation Error, Task (m) | Tracking Error, Joint (rad) | Tracking Error, Task (m) |
|---|---|---|---|---|
| IMM | 0.060 (1.02e-4) | 0.057 (6.39e-5) | 0.100 (3.01e-4) | 0.020 (7.16e-5) |
| AMS | 0.070 (1.01e-4) | 0.069 (8.07e-5) | 0.104 (3.02e-4) | 0.021 (6.56e-5) |
TABLE IV: 3 DOF Manipulator, Completely Unknown Mode (H3 (c))

| | Estimation Error, Joint (rad) | Estimation Error, Task (m) | Tracking Error, Joint (rad) | Tracking Error, Task (m) |
|---|---|---|---|---|
| IMM | 0.113 (3.86e-4) | 0.075 (2.14e-4) | 0.267 (9.98e-4) | 0.035 (1.67e-4) |
| AMS | 0.124 (5.16e-4) | 0.084 (2.73e-4) | 0.252 (1.03e-3) | 0.028 (1.37e-4) |
Fig. 5:
Evaluating the robustness of our AMS estimator versus an IMM estimator when sensor noise is mismodeled. The sensor noise is simulated with a larger standard deviation than the one assumed by the model (i.e. the sensor is noisier than modeled). Error bars indicate standard deviation.
Summary.
Our experiments indicate that our estimation algorithm exhibits reasonable robustness to mismodeling when compared to the IMM estimator. Furthermore, our experiments provide evidence that voting may provide robustness to mismodeling of sensor noise; however, we also observe that for NV ≥ 5, the computational advantages of our estimator compared to IMM estimation begin to diminish.
VI. Domain Generalization
So far, our experiments have been restricted to manipulator systems with mode switching caused by some change in the robot’s own internal dynamics. In fact, many interesting switching dynamics arise from changes external to the robot, such as varying environmental conditions or the changing behavior of other agents. For example, a vehicle driving over dirt experiences a change in dynamics that it must compensate for if it suddenly encounters an icy patch on the road, as shown in Fig. 1 (center). Similarly, a robot operating around a human must adapt when the person changes their intention, as shown in Fig. 1 (right). The following two experiments evaluate the performance of our AMS estimator in such situations.
Fig. 1:
We apply our adaptive estimator in three domains: (left) a manipulator moves into contact with a table; (center) the interaction dynamics between the vehicle’s wheels and the ground change when it moves from dirt to icy terrain; (right) a human switches goal locations as they move through a hallway (path shown in color), affecting a nearby robot’s motion plan (shown in grey arrows).
A. Driving on Variable Terrain
Vehicle Model.
We use a kinematic, skid-steering model of the Robotnik Summit XL robot, shown in Fig. 1. The state is the position, orientation, and linear and angular velocities of the robot with respect to the world frame. The input is a commanded velocity for each of the 4 wheels.
Modes.
For each terrain, we model the effect of friction by setting v_{k+1} = γ v_k + (1 − γ) v_{k,in}, where v_{k+1} is the vehicle velocity at the next time step k+1, v_k is the vehicle velocity at the current time step k, and v_{k,in} is the commanded velocity, which we convert from the commanded wheel velocities via the kinematics model. Finally, γ ∈ [0, 1] is a coefficient that represents the amount of friction.
Dirt. We model friction as having negligible effects, so γ = 0.
Sand. We model the friction with γ = 0.01.
Ice. We model the friction with γ = 0.001.
Shallow Mud. Here we set γ = 0; however, we model the vehicle as becoming stuck in the mud if its velocity is below 0.3 m/s.
Deep Mud. γ = 0 as in the model of shallow mud, but here the vehicle becomes stuck if its velocity is below 0.6 m/s.
Single Wheel Stuck. Here we set γ = 0; however, to model a single wheel being stuck, we simply zero out that wheel’s commanded velocity.
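The sketch below collects these terrain modes into a single velocity update; treating “stuck” as the velocity being held at zero is our reading of the mud modes, and the single-wheel-stuck mode is handled upstream by zeroing that wheel’s commanded velocity before the kinematic conversion, so it does not appear here.

```python
import numpy as np

def terrain_velocity_update(v, v_cmd, mode):
    """Next vehicle velocity v_{k+1} = gamma * v_k + (1 - gamma) * v_cmd under a
    given terrain mode (our reading of the mode descriptions in Sec. VI-A)."""
    speed = np.linalg.norm(v)
    if mode == "dirt":
        gamma = 0.0
    elif mode == "sand":
        gamma = 0.01
    elif mode == "ice":
        gamma = 0.001
    elif mode == "shallow_mud":
        if speed < 0.3:                  # stuck: assume the velocity is held at zero
            return np.zeros_like(v)
        gamma = 0.0
    elif mode == "deep_mud":
        if speed < 0.6:
            return np.zeros_like(v)
        gamma = 0.0
    else:
        raise ValueError(f"unknown terrain mode: {mode}")
    return gamma * v + (1.0 - gamma) * v_cmd
```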
Simulated Behavior.
We run each experiment for 1000 time steps, sampled every 1/30 s. The vehicle begins on the dirt terrain. After 3 m, the terrain switches to ice.
Control Objective.
We designed a 5 m straight-line reference trajectory in both position and velocity for the vehicle to track. The vehicle starts and ends at zero velocity.
Local Planner.
In order to track the reference trajectory, we employ a local planner, which performs a Dijkstra search to find a sequence of control inputs (or, local plan) that minimizes tracking error and control efforts, with respect to the model provided to it. We employ a horizon of 5 time steps with a time step of 0.25 s. At each time step, we re-run the local planner and execute the first input in the plan it returns. The local planner uses the model with maximum a posteriori probability from the estimator.
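The paper’s planner performs a Dijkstra search; as a simplified illustration of the same receding-horizon idea, the sketch below exhaustively scores short input sequences under the maximum a posteriori model and executes the first input of the best one. The cost weights and the model.step API are assumptions for illustration.

```python
from itertools import product
import numpy as np

def local_plan(x0, ref, inputs, model, horizon=5, w_track=1.0, w_effort=0.01):
    """Return the first input of the lowest-cost input sequence over the horizon.

    x0:     current state estimate
    ref:    list of reference states, one per step in the horizon
    inputs: discretized set of candidate control inputs (arrays)
    model:  object with step(x, u) -> next state, using the MAP dynamics mode
    """
    best_u, best_cost = None, np.inf
    for seq in product(inputs, repeat=horizon):
        x, cost = x0, 0.0
        for u, r in zip(seq, ref):
            x = model.step(x, u)
            # Penalize tracking error and control effort at each step.
            cost += w_track * np.linalg.norm(x - r) ** 2
            cost += w_effort * np.linalg.norm(u) ** 2
        if cost < best_cost:
            best_u, best_cost = seq[0], cost
    return best_u
```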
Independent Variables.
Same as in Sec. IV.
Dependent Measures.
We measure the mode prediction accuracy, the state estimation error, the position and velocity tracking errors, and the estimator update time, the same as in Sec. IV’s experiment.
Hypotheses.
The same as in Sec. IV’s experiment.
Trials.
For each algorithm, we conduct 20 trials, each with a different random seed.
Analysis.
The results summarized in Table V support both H1 and H2. We see that our AMS estimator outperforms the IMM and nominal estimators with respect to accuracy (both in mode prediction and state estimation) and performance (in tracking the reference trajectory).
TABLE V: Driving on Variable Terrain

| | Mode Pred. Accuracy (%) | Update Time (ms) | Track. Err. Pos. (m) | Track. Err. Vel. (m/s) |
|---|---|---|---|---|
| N | 34.0 (0.02) | 22 (0.1) | 0.41 (0.019) | 0.11 (0.003) |
| IMM | 79.0 (1.9) | 269 (1) | 1.4 (0.252) | 0.25 (0.023) |
| AMS | 99.3 (0.11) | 96 (0) | 0.03 (0.016) | 0.03 (0.002) |
| GT | 100 (0.0) | 31 (0) | 0.03 (0.015) | 0.02 (0.002) |
Summary.
We find that the AMS estimator works well when facing changes in dynamics arising from factors external to the vehicle itself, namely a change of terrain.
B. Human Motion Prediction and Robot Navigation
Human Model.
Each mode corresponds to a goal location in the environment that the human could walk to. The estimator uses a model of the human in the form of Eq. (1), in which the human moves at each time step along the heading u_{i,k}, the angle of the straight-line path from the person’s current position x_k to the i-th goal position g_i: u_{i,k} = atan2(g_{i,y} − x_{k,y}, g_{i,x} − x_{k,x}). T is the time step of the simulation. In the measurement model we have g_i(x_k) = x_k. There is no process noise, but the measurement noise has covariance R_i = diag(0.05², 0.05²). The indoor environment occupancy map is from [24], and the map and the 20 possible human goals are shown in Fig. 1.
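A small sketch of this goal-conditioned motion and measurement model; the constant walking speed v is an assumption made for illustration.

```python
import numpy as np

def human_step(x, goal, T, v=1.0):
    """Propagate the human's 2D position one step toward the given goal.

    The heading is the angle of the straight line from the current position to
    the goal; the speed v is an assumed constant walking speed.
    """
    heading = np.arctan2(goal[1] - x[1], goal[0] - x[0])
    return x + T * v * np.array([np.cos(heading), np.sin(heading)])

def human_measurement(x, rng):
    """Noisy position measurement: y = x + v, with v ~ N(0, diag(0.05^2, 0.05^2))."""
    return x + rng.normal(scale=0.05, size=2)
```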
Robot Model.
We model the robot as a 3D Dubins-car vehicle, with control inputs being the speed and steering angle. The robot has a fixed start and goal state across all trials and uses a spline-based motion planner [25]. At each time step, the robot makes a prediction of the human’s goal, taking the maximum a posteriori probability goal from the estimator. The robot subsequently predicts the human’s trajectory, simply extrapolating forward in time by applying a nominal velocity for a number of time steps into the future. We use this predicted trajectory for robot collision checking.
Simulated Behavior.
For each trial, the human starts at the same position, and follows the same mode switching sequence: goal 18 for the first 7 time steps, then goal 14 for the next 8 time steps, then goal 11 for the next 6 time steps, and finally goal 23 for the remainder of the trial (see Fig. 1). We run each trial for 38 time steps (~ 15.2 s).
Independent Variables.
Same as Sec. IV.
Dependent Measures.
We measure the mode (i.e. goal) prediction accuracy, the state estimation error, and the estimator update time, the same as in Sec. IV’s experiment. Furthermore, to capture the safety of the robot’s path, we measure the minimum distance between the robot and the human for the duration of each trial.
Hypotheses.
The same as in Sec. IV’s experiment.
Trials.
For each algorithm, we conduct 20 trials, each with a different random seed.
Analysis.
We summarize the results in Table VI. Our AMS estimator is about twice as fast to update, when compared to IMM, in support of H1. It also achieves comparable performance on all other measures when compared to the IMM estimator, in support of H2. In fact, we observe that the AMS is actually about 6% more accurate at predicting the human’s goal than IMM, consistent with our findings in the other experiments.
TABLE VI: Human Goal Prediction & Robot Navigation

| | Mode Pred. Accuracy (%) | Update Time (ms) | Safety (m) | Estimation Err. (m) |
|---|---|---|---|---|
| N | 0 (0) | 2 (0) | 0.19 (0) | 3.50 (0.04) |
| IMM | 59.5 (1.4) | 61 (1) | 0.30 (0.04) | 0.05 (0) |
| AMS | 66.0 (2.8) | 30 (1) | 0.32 (0.05) | 0.06 (0.01) |
| GT | 100 (0) | 2 (0) | 0.28 (0.01) | 0.0 (0) |
Summary.
We confirm that the AMS estimator is computationally faster than baseline estimators, and show that it can be employed in settings where the switches arise from the behavior of other agents in the environment.
VII. Evaluation in Hardware
We further evaluate our estimation algorithm in hardware experiments on a Kinova JACO 7 DOF manipulator, shown in Fig. 1. We input desired velocity commands to the manipulator, and have access to measurements of the joint positions via encoders at each of the joints, represented by the output yk+1 = xk+1 + vk+1, where vk+1 is Gaussian noise, as in Sec. II.
Modes.
Nominal. The nominal dynamics model is x_{k+1} = x_k + T u_k + w_k.
Locked joint. Let L_i be a diagonal matrix where the ith diagonal entry is 0, and all others are 1. Then the dynamics model is x_{k+1} = x_k + T L_i u_k + w_k.
End-effector contact. Let J(x_k) be the Jacobian of the end-effector position forward kinematics mapping; then the linear velocity of the end-effector at time step k is v_{e,k} = J(x_k) u_k. Furthermore, let J^#(x_k) be the Moore-Penrose pseudoinverse of J(x_k). Consider a contact with surface normal n̂. Then the dynamics model is x_{k+1} = x_k + T u_{0,k} + w_k, where
u_{0,k} = u_k − J^#(x_k) n̂ n̂^⊤ v_{e,k}   (4)
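A sketch of how the locked-joint and contact modes could be written in code; the contact projection mirrors the reconstruction of Eq. (4) above (removing the commanded end-effector velocity along the surface normal) and should be read as one plausible form, with jacobian(x) an assumed helper.

```python
import numpy as np

def locked_joint_step(x, u, T, i, w):
    """Locked joint i: zero out the i-th commanded joint velocity via L_i."""
    L = np.eye(len(u))
    L[i, i] = 0.0
    return x + T * (L @ u) + w

def contact_step(x, u, T, n_hat, jacobian, w):
    """End-effector contact: remove the component of the commanded end-effector
    velocity along the surface normal n_hat (one plausible reading of Eq. (4))."""
    J = jacobian(x)                       # end-effector position Jacobian
    J_pinv = np.linalg.pinv(J)            # Moore-Penrose pseudoinverse
    v_e = J @ u                           # commanded end-effector linear velocity
    u0 = u - J_pinv @ (np.outer(n_hat, n_hat) @ v_e)
    return x + T * u0 + w
```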
Control Objective.
In the first experiment, the JACO follows an end-effector position trajectory. In the second experiment, the JACO follows an end-effector pose (position and orientation) trajectory, sliding a grasped object across a table. In the third experiment, the JACO follows an end-effector position trajectory, specified by a planner whose task is to have the end-effector reach two position goals in the robot’s workspace, A and B.
Control Law.
Let X_k and X_k^d be the homogeneous transformation matrices representing the actual and desired end-effector poses with respect to a common inertial reference frame, respectively. Let V_{e,k} be the error twist, with matrix representation given by [V_{e,k}] = log(X_k^{−1} X_k^d). Then the tracking control law is u_k = J^#(x_k) K_p V_{e,k}, where K_p is a positive-definite diagonal proportional gain matrix and J^#(x_k) is the damped least-squares pseudoinverse of the body manipulator Jacobian.
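A minimal sketch of the damped least-squares pseudoinverse used in this control law; the damping constant is illustrative.

```python
import numpy as np

def damped_pinv(J, damping=0.01):
    """Damped least-squares pseudoinverse: J^T (J J^T + lambda^2 I)^(-1)."""
    m = J.shape[0]
    return J.T @ np.linalg.solve(J @ J.T + (damping ** 2) * np.eye(m), np.eye(m))
```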
Procedure.
For the first and third experiments, we simulated a locked joint by physically holding that joint in place. For the second experiment, we set the reference trajectory to pass through the table, to ensure contact. In all experiments, we use our AMS algorithm for state estimation.
Analysis.
In the first and third experiments, we find that our estimator is able to reliably identify the locked joint in about 1 s. Once the robot stopped at the end of the reference trajectory, we noticed that the estimator assigned about 50% probability to both the nominal and locked modes; this is expected, since it is not possible to distinguish between these modes when no velocity is commanded at the joint. In the second experiment, we found that our estimator took longer to identify the contact, between 3 and 5 s. This is partially due to certain unmodeled factors occurring during contact, including frictional effects such as stiction, as well as motion of the object in the gripper.
Summary.
Here, we showed the applicability of AMS to predict mode changes on a hardware manipulator system.
VIII. Discussion
Summary.
Robotic systems frequently operate under changing dynamics; to do so effectively requires a good estimator for both the continuous state and the dynamics mode. In this work, we propose such an estimator, which adapts its set of models at each time step via a novel algorithm. We provide a thorough experimental evaluation in simulation and hardware to demonstrate the algorithm’s effectiveness for state estimation under changing dynamics, including actuation faults, driving over various terrain, and navigating around a human with uncertain and changing intent.
Limitations and Future Work.
In future work, we plan to explore “active” algorithms that design control inputs to improve estimates and combine online parameter adaptation with our adaptive model set algorithm. Furthermore, our method is limited to a set of a priori specified models; however, adaptive estimators are amenable to learning models online. Finally, we plan to apply our AMS algorithm to discrete, non-Gaussian settings.
Supplementary Material
Acknowledgement
We thank Arno Rogg for his helpful input on robotic manipulator faults.
This paper was recommended for publication by Editor Dana Kulic upon evaluation of the Associate Editor and Reviewers’ comments. This work was supported by a NASA Space Technology Research Fellowship and an NSF CAREER award.
References
- [1] Blom HAP. “An efficient filter for abruptly changing systems”. The 23rd IEEE Conference on Decision and Control. December 1984.
- [2] Blom HAP and Bar-Shalom Y. “The interacting multiple model algorithm for systems with Markovian switching coefficients”. IEEE Transactions on Automatic Control (1988).
- [3] Hofbaur M. Hybrid Estimation of Complex Systems. Vol. 319. Lecture Notes in Control and Information Sciences. Germany: Springer Verlag, 2005.
- [4] Rong Li X. “Multiple-model estimation with variable structure. II. Model-set adaptation”. IEEE Transactions on Automatic Control 45.11 (November 2000).
- [5] Li XR. “Hybrid Estimation Techniques”. Stochastic Digital Control System Techniques. Ed. by Leondes CT. Vol. 76. Control and Dynamic Systems. Academic Press, 1996.
- [6] Maybeck P. Stochastic Models, Estimation, and Control. Elsevier Science, 1982.
- [7] Hwang I, Balakrishnan H, and Tomlin C. “Performance analysis of hybrid estimation algorithms”. 42nd IEEE Conference on Decision and Control. Vol. 5. December 2003.
- [8] Ackerson G and Fu K. “On state estimation in switching environments”. IEEE Transactions on Automatic Control 15.1 (February 1970).
- [9] Chang CB and Athans M. “State Estimation for Discrete Systems with Switching Parameters”. IEEE Transactions on Aerospace and Electronic Systems AES-14.3 (May 1978).
- [10] Jaffer AG and Gupta SC. “On estimation of discrete processes under multiplicative and additive noise conditions”. Information Sciences 3.3 (1971).
- [11] Cully A, Clune J, Tarapore D, and Mouret J-B. “Robots that can adapt like animals”. Nature 521.7553 (2015).
- [12] Solowjow F and Trimpe S. “Event-triggered Learning”. Automatica 117 (July 2020).
- [13] Harris TJ, Boudreau F, and Macgregor JF. “Performance assessment of multivariable feedback controllers”. Automatica 32.11 (1996).
- [14] Haruno M, Wolpert DM, and Kawato M. “Mosaic model for sensorimotor learning and control”. Neural Computation 13.10 (2001).
- [15] Koert D, Trick S, Ewerton M, et al. “Online learning of an open-ended skill library for collaborative tasks”. 2018 IEEE-RAS 18th International Conference on Humanoid Robots (Humanoids). IEEE, 2018.
- [16] Maeda G, Ewerton M, Osa T, et al. “Active incremental learning of robot movement primitives”. 2017.
- [17] Li X-R and Bar-Shalom Y. “Multiple-model estimation with variable structure”. IEEE Transactions on Automatic Control 41.4 (April 1996).
- [18] Li XR and Jilkov VP. “Survey of maneuvering target tracking. Part V. Multiple-model methods”. IEEE Transactions on Aerospace and Electronic Systems 41.4 (2005).
- [19] Li X-R, Bar-Shalom Y, and Blair WD. “Engineer’s guide to variable-structure multiple-model estimation for tracking”. Multitarget-Multisensor Tracking: Applications and Advances. Vol. 3 (2000).
- [20] Li XR, Zhi X, and Zhang Y. “Multiple-model estimation with variable structure. III. Model-group switching algorithm”. IEEE Transactions on Aerospace and Electronic Systems 35.1 (1999).
- [21] Li XR, Zhang Y, and Zhi X. “Multiple-model estimation with variable structure. IV. Design and evaluation of model-group switching algorithm”. IEEE Transactions on Aerospace and Electronic Systems 35.1 (1999).
- [22] Hand K, Murray A, Garvin J, et al. “Europa Lander Mission: Europa Lander Study 2016 Report”. NASA Tech. Rep. JPL D-97667 (2016).
- [23] Murray RM, Li Z, and Sastry SS. A Mathematical Introduction to Robotic Manipulation. CRC Press, 1994.
- [24] van Opdenbosch D, Schroth G, Huitl R, et al. “Camera-based Indoor Positioning using Scalable Streaming of Compressed Binary Image Signatures”. IEEE International Conference on Image Processing (ICIP 2014). 2014.
- [25] Walambe R, Agarwal N, Kale S, and Joshi V. “Optimal trajectory generation for car-type mobile robot using spline interpolation”. IFAC-PapersOnLine 49.1 (2016).