Journal of the Royal Society Interface. 2016 Sep;13(122):20160547. doi: 10.1098/rsif.2016.0547

Cerebellar-inspired algorithm for adaptive control of nonlinear dielectric elastomer-based artificial muscle

Emma D Wilson 1,2, Tareq Assaf 3, Martin J Pearson 3, Jonathan M Rossiter 3,4, Sean R Anderson 1,5, John Porrill 1,2, Paul Dean 1,2
PMCID: PMC5046955  PMID: 27655667

Abstract

Electroactive polymer actuators are important for soft robotics, but can be difficult to control because of compliance, creep and nonlinearities. Because biological control mechanisms have evolved to deal with such problems, we investigated whether a control scheme based on the cerebellum would be useful for controlling a nonlinear dielectric elastomer actuator, a class of artificial muscle. The cerebellum was represented by the adaptive filter model, and acted in parallel with a brainstem, an approximate inverse plant model. The recurrent connections between the two allowed for direct use of sensory error to adjust motor commands. Accurate tracking of a displacement command in the actuator's nonlinear range was achieved by either semi-linear basis functions in the cerebellar model or semi-linear functions in the brainstem corresponding to recruitment in biological muscle. In addition, allowing transfer of training between cerebellum and brainstem as has been observed in the vestibulo-ocular reflex prevented the steady increase in cerebellar output otherwise required to deal with creep. The extensibility and relative simplicity of the cerebellar-based adaptive-inverse control scheme suggests that it is a plausible candidate for controlling this type of actuator. Moreover, its performance highlights important features of biological control, particularly nonlinear basis functions, recruitment and transfer of training.

Keywords: cerebellum, artificial muscle, adaptive-inverse control, soft robotics, nonlinear control, transfer of training

1. Introduction

Making robots ‘soft’ significantly increases the range of environments in which they can operate, allowing them, for example, to interact safely with people (for a recent review, see [1]). However, robots made wholly or in part from materials that change shape when subjected to force are more difficult to control than rigid robots [2].

This is true for compliant actuators, capable of muscle-like high strain, which have been manufactured from a wide variety of materials including electroactive polymers (EAPs) [3] that can undergo large deformations in response to electrical stimuli. Dielectric elastomer actuators (DEAs) are an example of compliant EAP-based actuators with high energy density, large strain capability and a relatively fast response [4]. As such, they possess many of the desirable properties of biological muscle [5] and have attracted significant interest in the field of soft robotics research. However, even with recent advances in materials science and manufacturing processes, the precise control of DEAs remains a non-trivial problem owing to a number of intrinsic nonlinear and time variant characteristics as illustrated schematically in figure 1.

Figure 1.


Dielectric elastomer actuators (DEAs) are difficult to control. (a) Sketch of DEA operation. Voltage applied to the electrodes produces electrostatic pressure that squeezes and expands the elastomeric film between them. When the voltage is switched off, the film returns to its original shape (cf. [6]). (b) Time course of displacement response to a step change in voltage (ordinate shows voltage prior to amplification by a factor of 800). The time course can be approximated by a single exponential, with a time constant in this case of approximately 100 ms [7]. The responses shown in this and the subsequent panels were obtained from DEAs made of acrylic elastomer (3M VHB 4905) with conductive layers of carbon grease as the electrode plates [7,8] (further details in Methods). The schematic response shown here is derived from the nonlinear Hammerstein model developed by Wilson et al. [7] that accounts for 96–98.8% of the variance in the responses of six DEA samples to filtered white noise. (c) The top trace shows the coloured-noise voltage input (prior to amplification, cf. panel b) over a 30 min period of stimulation. The bottom trace shows the corresponding displacement response of a DEA sample. The response gradually changes (‘creeps’) over the 30 min period. (d) Data from panel c replotted to show displacement as a function of voltage for successive time periods as indicated by the colour scale. The displacement response is nonlinear, displays hysteresis, and varies over time (from fig. 1e of [8]).

When a membrane of elastomer is sandwiched between two compliant electrodes, applying a voltage to the electrodes causes the membrane to flatten and expand (figure 1a). A typical time course for this response to step changes in voltage is shown in figure 1b, where steady state is reached only after a substantial delay (in this case, approx. 300 ms). With a coloured-noise voltage input delivered for 30 min, the displacement response gradually changes (figure 1c). When these data are plotted as voltage versus displacement at different time points (figure 1d), it can also be seen that the response is a nonlinear function of input voltage and shows hysteresis, as well as increasing in amplitude with time (figure 1d). Furthermore, although not shown in the figure, significant effort is required in the manufacturing process of DEAs to reduce variance in the response between individual actuators; they are sensitive to temperature; and, when loaded, they are prone to failure and, for acrylic elastomers, to systematic degradation over time. These issues and phenomena are apparent in both dielectric and ionic EAP-based actuators [3,9] and constitute one of the main challenges to overcome before the technology can be incorporated more broadly into robotic systems. There is ongoing research into improving the material properties of DEAs, such as by using silicone, to address these challenges. The present work, however, focuses on control.

The similarities between DEAs and biological muscles referred to above extend to these control problems, which also characterize biological muscles. The question therefore arises of whether biological control strategies, which have evolved to deal with compliant materials, might show promise for the control of DEA-based actuators. These strategies are probably best understood for the extraocular muscles (EOMs) that control the eye, because for these muscles, the poorly understood effects of proprioception are less prominent than for skeletal muscles, and their neural control machinery does not involve the very complex organization of the spinal cord [10]. In broad terms, it appears that eye-movement-related neurons in the brainstem implement an approximate inverse model of the oculomotor plant, i.e. the EOMs and orbital tissue [11,12]. This approximate model is calibrated by the cerebellum, which is thought to ensure eye-movement accuracy by using a form of supervised learning, in which information about movement inaccuracy adjusts weights in a specialized neural network [13]. The combination of brainstem model and continual cerebellar calibration appears able to cope with the kinds of control problems illustrated in figure 1, as manifested by the oculomotor plant.

We therefore investigated how far a similar scheme could be used to control a DEA [7] by employing a modified version of a simplified model of the cerebellum and brainstem circuitry, previously developed in the context of oculomotor plant compensation [14,15]. In this model (figures 2 and 3: details in following sections), the cerebellum is represented by an adaptive filter [16,17] whose input is an efference copy of the commands sent to the plant. A measure of movement inaccuracy (retinal slip in the case of the oculomotor system) is sent to the adaptive filter as an error signal. The standard least mean square (LMS) learning rule is then used to adjust the adaptive-filter weights, so that the error is reduced, an example of adaptive-inverse control [18]. Application of this recurrent-architecture scheme to DEAs within their linear range of operation (figure 1d) produced accurate control of displacement despite variation in dynamics between actuators, and within an actuator as a function of time (figure 1c,d).

Figure 2.


Cerebellar microcircuit as an adaptive filter. (a) Highly simplified diagram of the cerebellar cortical microcircuit. Details in text. Not shown are Golgi cells, which receive input from mossy and parallel fibres and send inhibitory projections back to the synapses between mossy fibres and granule cells. This recurrent inhibitory network contributes to the recoding of mossy fibre inputs by granule cells (Discussion). (b) Interpretation of cerebellar microcircuit as an adaptive linear filter. Details in text. (c) Alpha function basis. Normalized impulse responses of alpha basis functions. (Online version in colour.)

Figure 3.


Basic architecture for motor plant compensation. (a) Linearized model of the horizontal VOR, the reflex that stabilizes images on the retina by reducing retinal slip. The vestibular system (not shown) generates a head velocity signal vh. Retinal slip (error, e) is zero when the eye velocity ve exactly opposes the head velocity vh. Control of the oculomotor plant (P) is provided by a combination of a brainstem filter (B) and recurrently connected adaptive cerebellar filter (C). (b) Architecture for position control of a nonlinear DEA plant using a control scheme based on the VOR. Here, compensation is again provided by a combination of B and C; however, position, as opposed to velocity, is controlled, a reference model (M) is included such that a filtered version of the reference input is tracked, and the elements represented in the diagram are not necessarily linear filters. (Online version in colour.)

Here, we seek to extend these findings to the nonlinear range of DEA operation (figure 1d), by altering the linear model in three ways. First, the adaptive filter model is expanded to allow it to produce nonlinear outputs, using a thresholding scheme similar to that described by Spanne & Jörntell [19] which is based on the properties of neural processing in the granular layer of the cerebellum. Second, the brainstem model is also expanded to allow the production of nonlinear outputs, in this case by mimicking the effects of recruitment. Biological muscles are composed of motor units arranged in parallel, with each unit controlled by its own motoneuron (for most muscles). To increase the force exerted by the muscle, the control signal sent to the motoneuron pool changes its firing in two ways. One is an increase in the number of motoneurons firing (recruitment), the other is an increase in the firing rate of those motoneurons already recruited [20]. Because later recruited units are typically more powerful than those with lower thresholds for both skeletal muscles [21] and probably EOMs [22], a nonlinearity of the kind shown in figure 1d could, in principle, be accommodated by appropriate recruitment. Finally, an additional learning mechanism is introduced that allows cerebellar output to ‘teach’ the brainstem, thereby allowing the transfer of large gains from the cerebellum to the brainstem. Transfer of this kind has been observed in the oculomotor system (references in [23]).

Evaluating this bioinspired control scheme for DEAs has implications not only for the control of DEA-based actuators, but also for understanding cerebellar function. Webb [24] explains the general usefulness of robotics for clarifying and evaluating hypotheses in neuroscience: here, the specific hypotheses concern the competencies of the adaptive filter model of the cerebellum and the recurrent architecture for the control of compliant actuators.

The paper is structured as follows. The Methods section first describes the components of the algorithm, that is, the adaptive filter model of the cerebellar microcircuit and the recurrent architecture for plant compensation. It then outlines the changes made to the algorithm to deal with DEA nonlinearities, resulting in three new control schemes, and in the final subsection describes the experimental set-up. The Results section shows the effects of applying the new control schemes compared with conventional PID control, and the Discussion section considers their limits and significance. Finally, appendix A provides the mathematical details of the control algorithms.

2. Methods

2.1. Cerebellum: the adaptive filter model

The cerebellar cortical microcircuit can be modelled as an adaptive filter [16,17]. The main features of the microcircuit are shown schematically in figure 2a, and translated into adaptive-filter form in figure 2b. In this model, the main cerebellar inputs carried by mossy fibres (figure 2a) are represented by u. These are recoded by a bank of fixed filters G1 … GN corresponding to processing in the granular layer, giving rise to outputs p1 … pN that correspond to signals in parallel fibres. The parallel-fibre signals are weighted (w1 … wN, corresponding to synapses between parallel fibres and Purkinje cells) and summed linearly (by Purkinje cells) to give the filter output z. The Purkinje cells also receive input via a single climbing fibre. This input acts as a teaching signal (in the simulations presented here the teaching signal is the tracking error e, that is, the difference between actual and desired actuator position). The Purkinje cell synaptic weights are modified over time according to the covariance learning rule δwi = −β⟨e pi⟩, which corresponds to the LMS learning rule [25].
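
As a concrete illustration, the filter output and weight update can be written in a few lines of Python/NumPy. This is a schematic sketch rather than the authors' LabVIEW implementation; the learning-rate value, the toy numbers and the sign convention for the error are assumptions.

import numpy as np

def purkinje_output(w, p):
    """Filter output z: weighted sum of the parallel-fibre signals p (figure 2b)."""
    return np.dot(w, p)

def lms_update(w, p, e, beta=0.1):
    """Covariance/LMS rule: move each weight against the correlation between
    the climbing-fibre error e and its parallel-fibre signal p_i."""
    return w - beta * e * p

# usage at one time step (toy numbers)
w = np.zeros(4)
p = np.array([0.2, -0.1, 0.05, 0.3])   # parallel-fibre signals
e = 0.02                               # tracking error fed back from the plant
w = lms_update(w, p, e)
z = purkinje_output(w, p)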

Much of the power of the adaptive filter depends on how far the basis filters G1, … , GN provide a rich recoding of the input, allowing synthesis of a large range of desired outputs. In engineering applications, the basis is often taken to be a bank of tapped delay lines. However, a very large number of delay lines may be required to represent the long time-constant behaviours characteristic of biological systems. We therefore use an alternative basis better adapted to biological control, namely a set of alpha functions [7] in which the average delay increases logarithmically (figure 2c). These cover a large range of time constants very economically, although filter width increases in proportion to delay, giving less accurate localization in time at longer delays.
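
A sketch of such a basis is given below. The discretization of the alpha function as a second-order filter with a repeated pole, and the unit-DC-gain normalization, are assumptions made for illustration; they stand in for the discrete alpha-filter approximation referred to in table 2.

import numpy as np
from scipy.signal import lfilter

def alpha_basis(u, time_constants, ts=0.02):
    """Filter the input u through a bank of alpha-function filters, i.e.
    second-order low-pass filters with a repeated pole (one per time constant)."""
    rows = []
    for T in time_constants:
        a = np.exp(-ts / T)              # pole for time constant T at 50 Hz sampling
        num = [0.0, (1.0 - a) ** 2]      # numerator chosen to give unit DC gain
        den = [1.0, -2.0 * a, a ** 2]    # (1 - a q^-1)^2
        rows.append(lfilter(num, den, u))
    return np.vstack(rows)

# four filters with log-spaced time constants from 0.1 s to 0.5 s (cf. table 2)
T_bank = np.logspace(np.log10(0.1), np.log10(0.5), 4)
g = alpha_basis(np.random.default_rng(0).standard_normal(500), T_bank)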

Log-spaced alpha functions, like tapped delay lines, have highly correlated outputs, which drastically reduces the speed of learning. For learning rates to be maximized, the basis filter outputs must be mutually uncorrelated and have equal power [26]. It is thought that unsupervised plasticity mechanisms within the granular layer may reduce correlations between granule cell outputs [27]. We model these decorrelation processes by applying a further processing stage to the filter outputs, represented by the unmixing matrix Q in figure 2b. This matrix is estimated using singular value decomposition based on a batch of filter outputs to provide uncorrelated, unit power, parallel fibre signals [7].
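
The sketch below shows one way such a matrix Q can be estimated offline with an SVD. The batch size, the scaling convention and the absence of mean removal are assumptions, not details taken from the original implementation.

import numpy as np

def estimate_Q(G_batch, eps=1e-8):
    """Estimate an unmixing matrix from a batch of basis-filter outputs
    (rows = time steps, columns = filters) so that p = Q @ g has
    approximately uncorrelated, unit-power components."""
    n_samples = G_batch.shape[0]
    _, s, Vt = np.linalg.svd(G_batch, full_matrices=False)
    scale = np.sqrt(n_samples) / (s + eps)   # whitens each principal direction
    return (Vt.T * scale).T                  # Q = diag(scale) @ Vt

# check on correlated toy data
rng = np.random.default_rng(0)
G = rng.standard_normal((1000, 4)) @ rng.standard_normal((4, 4))   # correlated outputs
Q = estimate_Q(G)
P_batch = G @ Q.T                                                  # p_k = Q g_k for each row
print(np.round(P_batch.T @ P_batch / len(P_batch), 2))             # approximately the identity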

Although the cerebellum is involved in a very wide variety of tasks, the microcircuit itself is relatively homogeneous over the entire cortex [13]. This implies that the same adaptive filter model underlies many different processing tasks, so a fundamental design rule for our biomimetic control scheme is that the basic filter design should not be modified in ad hoc ways for different control applications. Instead, task-specific processing is obtained by embedding the adaptive filter in a range of different connectivities [12].

2.2. Recurrent architecture

In the linear case, embedding the cerebellar learning element in a recurrent architecture (figure 3a) simplifies the adaptive control problem [14,15]. In this architecture, inspired by the organization of the cerebellar flocculus and the oculomotor brainstem that stabilizes eye gaze via the vestibulo-ocular reflex (VOR), the controller has two main parts.

  • (1) The fixed brainstem part of the controller B converts a signal representing head velocity vh into a control signal u which is sent to the oculomotor plant P. In the VOR, the task is to move the eyes in the opposite direction to the head, so that eye velocity ve is equal to −vh, thereby stabilizing the image on the retina. The brainstem constitutes an approximate inverse of the plant (P−1).

  • (2) The adaptive part of the controller C receives an efference copy of the motor commands u generated by the brainstem. If these commands are inaccurate, then the resultant eye movements will not match the head movements, and the image will move across the retina, generating a retinal-slip error signal e. This signal drives learning in C, which adjusts its output z to the brainstem so as to reduce e. When learning is complete, the combined controller approximates the inverse of the plant transfer function [18], and the cerebellum has learnt an incremental plant model C = B−1 − P.

An important feature of the recurrent architecture shown in figure 3a is that it can use sensory errors to drive adaptation directly, rather than needing to estimate what the required motor command should have been [12,28]. In particular, it guarantees that the teaching signal required for stability and convergence is simply the tracking error rather than a more complex teaching signal [15].
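
To make the convergence claim concrete, here is a scalar, static-gain toy version of the recurrent loop (an illustration with assumed numbers, not a model of the DEA or of the oculomotor plant): the cerebellar gain c is trained directly by the tracking error and settles near the incremental model 1/B − P, so that the combined controller B/(1 − cB) approximates the plant inverse.

import numpy as np

rng = np.random.default_rng(0)
P, B = 0.5, 1.4                    # toy plant gain and crude brainstem "inverse"
c, beta = 0.0, 0.01                # cerebellar gain and its learning rate

for _ in range(20000):
    r = rng.standard_normal()      # reference input (reference model M = 1 here)
    u = B * r / (1.0 - B * c)      # motor command; the recurrent loop z = c*u is
                                   # solved algebraically for this static case
    e = P * u - r                  # sensory tracking error, no motor error needed
    c -= beta * e * u              # LMS on the efference copy of the command

print(round(c, 3), round(1.0 / B - P, 3))   # c ends up close to 1/B - P (about 0.214)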

Figure 3b shows how the basic recurrent architecture was altered for control of a DEA in its linear operating range, using a biohybrid approach that incorporates model reference control [7]. After learning, the behaviour of the controlled plant matches that of the reference model M (i.e. it tracks y, a filtered version of r), which specifies a realistic response for the controlled plant; the use of a reference model also ensures that the estimated controller is proper. The use of model reference adaptive control is a technical solution that enables the cerebellar algorithm to function independently of the plant order.

2.3. Dealing with nonlinearity

Nonlinear plants do not have transfer functions, but the same concept of plant compensation (inverse control) holds if the plant has an inverse that is stable [29]. We assume here that the DEA plant has an inverse that is stable (i.e. bounded output implies bounded plant input), a reasonable assumption given that the input signal must always be kept small enough to avoid damage. For the DEAs used in this study, the plant can be represented by a Hammerstein model [7], that is as a static nonlinearity (SNL) followed by a linear dynamic system (LDS; figure 4a). Such a plant can be perfectly compensated if the controller contains an LDS equal to the inverse of the plant LDS followed by an SNL equal to the inverse of the plant SNL (figure 4b).

Figure 4.


Nonlinear inverse control. (a) General Hammerstein model of a system. (b) Pictorial representation of perfect compensation of the Hammerstein system. (c) Demonstration of how piecewise linear elements can be used to construct a nonlinear function. (d) Nonlinear cerebellar adaptive filter, as an extension of the adaptive linear filter shown in figure 2b. (e) Nonlinear brainstem used to control the DEA. Details for (c–e) are given in appendix A. (Online version in colour.)

Here, we use a series of piecewise linear elements to approximate the continuous nonlinear function that constitutes the SNL, as shown in figure 4c (equation (A 9) in appendix A). Two methods were tried, both of them bioinspired and consistent with the basic circuitry of the adaptive filter and the recurrent architecture.

  • (1) One of the features of recurrent inhibition in the granular layer is that it can provide a natural thresholding mechanism for granule cell responses. Spanne & Jörntell [19] have argued that the resulting threshold-linear processing elements may be useful for nonlinear control problems. We therefore incorporated a bank of threshold-linear elements with varying threshold as a pre-processing stage (see figure 4d and equations (A 6) and (A 7) in appendix A) providing a flexible set of nonlinear basis filters.

  • (2) Threshold nonlinear elements are also found in the brainstem. Oculomotor neurons have a wide range of thresholds [30], and it has been suggested that recruitment can be used to linearize nonlinear plants [31]. We therefore investigated whether a bank of threshold linear units in the brainstem (figure 4e) could compensate for the DEA plant nonlinearity.

The final control scheme to be examined included an additional site of plasticity in the brainstem (equation (A 11) in appendix A), inspired by the existence of such a site in the vestibular nuclei that allows the cerebellar input to drive brainstem learning during VOR adaptation [32]. This mechanism can be used to transfer models learnt in the cerebellum to the brainstem [23], and predicts a heterosynaptic learning rule using correlations between the brainstem input and the inhibitory cerebellar input drive that has been verified experimentally [33]. An advantage of learning transfer is that it limits the amount of gain that must be stored in the cerebellar loop, improving loop stability if the plant is subject to large changes over time.

2.4. Experimental set-up

The experimental set-up was the same as that described previously in Wilson et al. [7]. The control task was to drive the 1 degree of freedom displacement response of the DEA to track a filtered coloured-noise reference signal y such that the controlled actuator behaved as specified by the reference model M (figure 3b). Each DEA consisted of a thin, passive elastomeric film, sandwiched between two compliant electrodes (figure 5a). Voltage applied to the electrodes squeezed the film and expanded it biaxially. To constrain the controlled variable to 1 degree of freedom, a spherical load was placed at the centre of a circular DEA and its motion in the vertical plane (i.e. vertical displacement) was measured (figure 5a,b).

Figure 5.


Experimental set-up. (a) Photograph of experimental set-up for measuring the vertical displacement of a DEA stretched on a circular Perspex frame supporting a spherical load, using a laser displacement sensor. (b) Diagram of the experimental set-up, showing displacement x. (Adapted from fig. 2a and b of [7].) (Online version in colour.)

The DEAs were made of acrylic elastomer (3M VHB 4905) with an initial thickness of 0.5 mm. This material was chosen owing to its low cost, availability, robustness and adhesive properties that were exploited in the assembly process. The elastomer was pre-stretched biaxially by 350% (where 100% was the unstretched length) to a thickness of approximately 41 µm (unmeasured) prior to being fixed on a rigid Perspex frame with inner and outer diameters of 80 and 120 mm, respectively. A conductive layer of carbon grease (MG chemicals) formed the electrodes that were brushed on both sides of the VHB membrane as circles with a diameter of approximately 35 mm. The load used during experiments was a sphere weighing 3 g.

The control algorithm (table 1) was implemented in LabVIEW and from there embodied in a CompactRio (CRIO-9014, National Instruments) platform, with input module NI-9144 (National Instruments) and output module NI-9264 (National Instruments) used in combination with a host laptop computer. LabVIEW was run on the host laptop computer, with communication between the host laptop and CompactRio (CRio) carried out using the LabVIEW shared variable engine. In all experiments, all signals were sampled simultaneously at a frequency of 50 Hz.

Table 1.

Plant compensation control algorithm. Algorithm used to control the response of a DEA. The timing was done using a National Instruments CompactRio with LabVIEW software. Read/write operations used a National Instruments FPGA module (see Methods). The delay between steps 8 and 9 was 0.0001 s.

control algorithm for each time step, k
1. yk = M(q, τ)rk  (filter input signal through reference model)
2. qk = f2(uk−1)  (nonlinear transformation of the previous motor command)
3. for i = 1 : nf do gi,k = Gi(q, Ti)qk end for  (filter transformed motor commands through bank of alpha filters)
4. pk = Qgk  (transform filter outputs into a faster-learning basis)
5. zk = Σi wi,k pi,k  (calculate adaptive filter output)
6. vk = BL(q, γ)(rk + zk)  (filter adaptive filter output and input signal through linear brainstem filter)
7. uk = Σj gj,k H(vk − ρj)(vk − ρj)  (calculate output of piecewise linear, nonlinear brainstem element)
8. WRITE uk  (use motor command to drive the DEA)
9. READ xk  (measure response of the DEA)
10. ek = xk − yk  (calculate error between desired and actual response)
11. p̃k = M(q, τ)pk  (filter parallel fibre signals through reference model)
12. wi,k+1 = wi,k − β ek p̃i,k  (update adaptive filter weights)
13. for j = 1 : m do
  if j = 1: gj,k+1 = gj,k + ζ zk μj,k
  otherwise: gj,k+1 = gj,k + ζ zk μj,k − ζ zk μj−1,k
end for  (update gains of piecewise linear brainstem element)
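
The listing below is an executable Python rendering of these steps for the sixth scheme (nonlinear cerebellum plus adaptive nonlinear brainstem). It is a structural sketch, not the LabVIEW/CompactRio code: the plant is a toy stand-in rather than the identified DEA model, the unmixing matrix Q is left as the identity, the reference signal is synthetic, and the filter discretizations, initial gains and reduced learning rates are assumptions; only the thresholds, filter coefficients and time constants are taken from table 2.

import numpy as np

dt = 0.02                                                 # 50 Hz sampling (Methods)
Ts = np.logspace(np.log10(0.1), np.log10(0.5), 4)         # alpha-filter time constants (table 2)
a = np.exp(-dt / Ts)[:, None]                             # assumed discrete alpha-filter poles
sigma = np.array([2.18, 2.48, 2.78, 3.08, 3.38])          # cerebellar thresholds (table 2)
rho = np.array([0, .255, .51, .765, 1.02, 1.275, 1.53, 1.785])   # brainstem thresholds (table 2)
g_bs = np.array([2.1] + [0.0] * 7)                        # simplified initial brainstem gains
beta, zeta = 0.01, 0.001                                  # learning rates (values assumed here)

n_in = 1 + len(sigma)                                     # u plus its thresholded copies
alpha_state = np.zeros((2, len(Ts), n_in))                # two past outputs per filter/channel
w = np.zeros(len(Ts) * n_in)                              # cerebellar weights
p_f = np.zeros_like(w)                                    # parallel fibres filtered through M
Q = np.eye(len(w))                                        # identity stand-in for the SVD unmixing
y = v = s_prev = u = x = r_lp = 0.0

def plant(u, x):                                          # toy Hammerstein stand-in, NOT the DEA model
    return 0.9 * x + 0.1 * (3.0 * np.tanh(0.4 * u))       # static nonlinearity, then first-order LDS

rng = np.random.default_rng(1)
for _ in range(30000):
    r_lp = 0.98 * r_lp + 0.02 * rng.standard_normal()     # low-pass white noise
    r = float(np.clip(1.0 + 5.0 * r_lp, 0.0, 2.0))        # positive reference (arbitrary units)
    y = 0.82 * y + 0.18 * r                               # 1: reference model M(q, tau)
    q_in = np.concatenate(([u], np.maximum(u - sigma, 0.0)))     # 2: q_k = f2(u_{k-1})
    g_new = 2 * a * alpha_state[0] - a**2 * alpha_state[1] + (1 - a)**2 * q_in   # 3: alpha filters
    alpha_state = np.stack([g_new, alpha_state[0]])
    p = Q @ g_new.ravel()                                 # 4: (nominally decorrelated) basis
    z = w @ p                                             # 5: cerebellar output
    s = r + z
    v, s_prev = 0.82 * v + 0.66 * s - 0.48 * s_prev, s    # 6: linear brainstem filter B_L
    mu = np.maximum(v - rho, 0.0)                         # 7: piecewise linear brainstem elements
    u = g_bs @ mu                                         #    motor command u_k
    x = plant(u, x)                                       # 8-9: drive the plant, read its response
    e = x - y                                             # 10: tracking error
    p_f = 0.82 * p_f + 0.18 * p                           # 11: parallel fibres through M
    w -= beta * e * p_f                                   # 12: LMS update of cerebellar weights
    g_bs[0] += zeta * z * mu[0]                           # 13: transfer of learning to the brainstem
    g_bs[1:] += zeta * z * (mu[1:] - mu[:-1])             #     gains (cf. equation (A 11))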

A laser displacement sensor (Keyence LK-G152, repeatability—0.02 mm) was used to measure the vertical movement of the mass sitting on the circular DEA. This signal was supplied to the input module of the CRio. From the output module of the CRio, voltages were passed through a potentiometer (HA-151A HD Hokuto Denko) and amplified (EMCO F-121 high-voltage module) with a ratio of 15 V : 12 kV and applied to the DEA.

2.5. Control schemes

Six control schemes were applied to the DEA shown in figure 5. In each case, the actuator was required to track for 900 s a low-pass filtered (1 Hz cut-off) white-noise reference signal, with a range of desired displacement amplitudes of 0.1–1.8 mm. This amplitude range corresponds to average motor commands (voltage inputs to the DEA) of the order of 3 V prior to amplification. These inputs excite the full nonlinear dynamics of the DEA.

Five schemes used a model brainstem and recurrently connected cerebellar adaptive filter to compensate for the DEA dynamics, an arrangement previously suggested for compensation of the oculomotor plant in animals and humans. All were tested in simulation, and the last of them (the sixth scheme) was also applied experimentally. In addition, a PID-based control scheme was tested in simulation for comparison.

3. Results

The first control scheme applied to the DEA (see Methods) used the linear brainstem and cerebellar models (figure 6a) previously applied to both simulated and experimental control of the DEA in its linear range [7]. The performance of the fixed linear brainstem (defined in table 2) before and after learning is shown in figure 6b,c. As expected, the linear control scheme cannot fully compensate for the nonlinear plant dynamics, having particular trouble tracking larger peaks in the desired displacement response. Its use, here as a reference condition, gives an indication of the problems caused by the nonlinearity, with its steady-state RMS error (figure 6d) being 0.04 mm. For comparison, the linear control scheme gives steady-state RMS errors of 0.011 mm when the DEA is excited over a reduced range (i.e. the reference signal reduced to a maximum of 1 mm), such that the dynamics can be approximated as linear [7].

Figure 6.


Linear versus nonlinear cerebellar control. Simulated results for DEA control using three different schemes. (a) Diagram of the three control schemes. An arrow indicates an adaptive element, and a shaded box a nonlinear element. (b) Tracking a desired displacement signal using each controller. The left-hand panel shows the desired and actual responses before learning for the two cerebellar-based schemes, and the right-hand panel the responses after learning compared with the response of the PID controller. (c) Errors in displacement tracking corresponding to the tracking response shown in panel b. (d) Windowed RMS errors during learning for each controller. Errors are smallest using the nonlinear cerebellum-based controller. (e) RMS errors averaged over the final 320 s of 30 min of learning, shown for the nonlinear controller as a function of the number of nonlinear elements. (f) Response of the two learned cerebellum–brainstem controllers to an impulse input. The response is the output when a pulse of length dt (sample time 0.02 s) and magnitude 1/dt was input to the learned controller.

Table 2.

Parameters for experiments. Parameters used to control the response of a DEA. The third experiment was linear PID control for which control parameters are provided in appendix A.

parameter: value (experiments)
  • number of piecewise linear brainstem terms: m = 1 (first, second); m = 8 (fourth, fifth, sixth)
  • thresholds for brainstem piecewise linear terms: ρ1 = 0 (first, second); ρ1−8 = [0 0.255 0.51 0.765 1.02 1.275 1.53 1.785] (fourth, fifth, sixth)
  • initial brainstem gains: g0−1 = [2.1 0.9] (first, second); g0−8 = [0.92 2.38 1.07 −1.92 −0.78 −0.11 −0.12 −0.045 0] (fourth, fifth); g0−8 = [2.1 0.9 0 0 0 0 0 0 0] (sixth)
  • rate of learning of brainstem gains: ζ = 0 (first, second, fourth, fifth); ζ = 0.01 (sixth)
  • number of nonlinear cerebellar elements: ν = 0 (first, fourth); ν = 5 (second, fifth, sixth)
  • thresholds for nonlinear cerebellar elements: n.a. (first, fourth); σ1−5 = [2.18 2.48 2.78 3.08 3.38] (second, fifth, sixth)
  • discrete alpha basis filters (all): second-order low-pass filters with repeated root and time constant Ti (discrete approximation not reproduced)
  • number of alpha filters (all): nf = 4
  • time constants of alpha filters (all): log-spaced from T1 = 0.1 to T4 = 0.5
  • fixed cerebellar bias (all): value not reproduced
  • rate of error learning (all): β = 8
  • discrete linear brainstem filter (all): BL(q, γ) = (0.66 − 0.48q−1)/(1 − 0.82q−1)
  • discrete linear reference filter (all): M(q, τ) = 0.18/(1 − 0.82q−1)

The performance of the second control scheme, in which a nonlinear adaptive cerebellum replaces the linear adaptive cerebellum of the first scheme, is also shown in figure 6. It learns to compensate well for the nonlinear plant, and the desired displacement response is accurately tracked over the full range of displacements, including larger peaks (figure 6b,c). This improvement is reflected in lower RMS errors (figure 6d: 0.019 mm). The number of nonlinear cerebellar elements required to achieve this reduction in error is approximately 5 (figure 6e).

Finally, the PID controller initially performed better than either adaptive scheme (figure 6d). As learning proceeded, the linear adaptive scheme came to perform similarly as indicated by RMS error, whereas the nonlinear scheme did slightly better.

The fourth control scheme to be investigated used a linear adaptive cerebellum as in the first scheme, but combined it with a nonlinear brainstem intended to capture the effects of motor unit recruitment in skeletal and EOMs (figure 7a). Its eventual performance was slightly worse than that of the second scheme (figure 7b; average final RMS errors of 0.030 mm), and learning was somewhat slower.

Figure 7.


Comparison of nonlinear control strategies. Simulated results when applying different nonlinear control strategies to control of the DEA. (a) Diagram of the four nonlinear control schemes. An arrow indicates an adaptive element, and a shaded box a nonlinear element. Results for the linear brainstem and nonlinear cerebellum (red lines) were previously shown in figure 6. (b) Windowed RMS errors for each control scheme. (c) Cerebellar output for each control scheme. For the three schemes in which the brainstem is fixed, the cerebellar output increases over time, as the properties of the DEA change (‘creep’). When the brainstem is adaptive because of learning transferred from the cerebellum, the cerebellar output does not increase over time. (d) Evolution of cerebellar weights over time for each control scheme. Note that the y-axis scale of the right-most plot is 10× smaller than in the other plots.

In the fifth and sixth control schemes, both the brainstem and cerebellum were nonlinear, but whereas in the fifth scheme the brainstem remained fixed, in the sixth it was adaptive (figure 7a), with learning driven by changes in cerebellar output, as can occur in VOR adaptation. Both schemes produced good learning (steady-state RMS errors 0.015 and 0.011 mm, respectively), the value for the sixth scheme matching the steady-state RMS errors obtained when controlling the DEA over a reduced linear range using a linear control scheme. However, the two schemes achieved this level of performance in different ways. Figure 7c shows how cerebellar output varies over time for each of the four nonlinear schemes. If there is no transfer of learning between cerebellum and brainstem (the second, fourth and fifth schemes), then this output gradually increases to cope with the slow ‘creep’ of plant properties (figure 1c). Such continual increase is undesirable, especially when the cerebellum is connected in a recurrent loop, because large cerebellar outputs are effectively large gains in a feedback loop and can thus cause instabilities. However, when a nonlinear adaptive brainstem element is used and learning is transferred from the cerebellum to the brainstem, the cerebellar output no longer increases continually over time (figure 7c). These differences between the control schemes are also reflected in the evolution of cerebellar weights as learning proceeds (figure 7d). In particular, weight change is very much reduced and stabilized when transfer to the brainstem is allowed (figure 7d, right-most panel).

Finally, the sixth control scheme was applied to displacement control of the real-world DEA system, and the resulting performance compared with that seen in the simulation (figure 8a). After learning, both the simulated and real-world systems track the desired displacement response accurately. It appears that the model of the DEA used in the simulations provides a reasonable description of its dynamics, and that the control algorithm works as expected on a real-world system. RMS error is shown in figure 8b, and cerebellar output in figure 8c.

Figure 8.


Experimental control. Experimental control when using a nonlinear cerebellum and nonlinear brainstem with transfer of learning. (a) Tracking a desired displacement signal in simulation and a real-time experiment. The left-hand panel shows desired and actual responses before learning and the right-hand panel the responses after learning. The simulation characterizes the actual system well. (b) Windowed RMS errors from real-time control experiment. (c) Cerebellar output during real-time control experiment. (d) Learnt brainstem nonlinearity in simulation (left) and experiment (right) compared with initial linear brainstem approximation. The learnt brainstem nonlinearity reasonably approximates the estimated inverse of the plant nonlinearity over the majority of input signals.

The learnt brainstem nonlinearity (from an initially linear estimate) was compared with the estimated inverse of the plant nonlinearity for both the simulated and real-world systems (figure 8d). The specific form of the plant nonlinearity differs between the real-world and simulated systems owing to variations in the characteristics of individual actuators [8], though the general form of the nonlinearity is similar. In both simulated and the real-world systems, the learnt brainstem nonlinearity reasonably approximates the inverse of the plant nonlinearity (for ideal compensation, the two should be equal). The approximation is less good for large and small displacements, probably because there are fewer data available to learn over these ranges.

For the results shown in figure 8, the transfer of learning from the cerebellum to brainstem was calculated using a learning rule in which previous gains are taken into account (equation (A 11) in appendix A) to provide some decorrelation of the signals being weighted. A simpler learning rule that does not include the effect of previous gains was also tested on the simulated system and gave very similar performance to that shown in figure 8 (results not shown).

4. Discussion

These results show that a bioinspired control scheme, based on cerebellar calibration of the VOR, is capable of compensating for the plant nonlinearities of a DEA-based actuator. Good performance was obtained with either an adaptive (cerebellar) filter using nonlinear basis functions, or a fixed brainstem nonlinearity based on recruitment of EOM. In addition, a biologically based arrangement, in which the adaptive filter teaches the brainstem model of the inverse plant, allowed the amplitude of cerebellar output to remain relatively stationary even though plant properties gradually changed with time.

We consider the implications of these findings first for EAP control, then for understanding biological control. Finally, we discuss possibilities for future work.

4.1. Electroactive polymer control

A wide variety of control schemes have been proposed for both ionic and dielectric EAPs [9,34–40] and, at present, there appears to be no consensus about which of them is most suitable.

The schemes particularly relevant to this study are those involving inverse control. Some use non-adaptive methods, deriving a plant model by system identification techniques and then inverting it (with appropriate safeguards) [34,36,37,39]. Of the studies that do involve adaptive methods, Hao & Li [35] use an online LMS algorithm to identify hysteresis parameters, and a separate offline identification algorithm to obtain creep parameters. Sarban & Jones [38] derive a physics-based electromechanical model of the DEA, and estimate values for its 14 parameters. Druitt & Alici [9] argue that the problems of explicit modelling can be avoided by using intelligent controllers such as those based on fuzzy logic or neural networks, and demonstrate the utility of an adaptive neuro-fuzzy inference system.

Our approach also seeks to reduce the need for offline system identification by using only a relatively crude inverse model in the ‘brainstem’, and in addition employs an adaptive filter as the intelligent part of the control system rather than a more complex adaptive neuro-fuzzy inference system. Moreover, the brainstem model can be taught, which both reduces dependence on a priori estimates and is particularly suitable for tracking slow changes in performance (‘creep’) without long-term increases in adaptive-controller output. Finally, the basic structure of the control scheme suggests immediate possibilities for compensating for temperature effects or poor manufacturing tolerances, for implementing impedance control in agonist–antagonist EAPs, and for augmenting feedback in mixed feedback–feedforward control schemes (discussed further in §4.3).

4.2. Biological control

The importance of using robots to test hypotheses about neural function is well recognized [24,41], and previous work has explored how cerebellar-inspired control schemes could be applied to robots [42–45]. The success of the adaptive-filter model embedded in the recurrent architecture in controlling DEAs in their linear range [7] prompted its extension here to the nonlinear range. The results have three implications for understanding neural function.

The first concerns the adaptive filter model of the cerebellar microcircuit. How granular layer processing could generate the equivalent of basis filters is not well understood, although current approaches using insights from reservoir computing are attracting interest [46,47]. These treat the granular layer as a recurrent inhibitory network, in which granule cells project to inhibitory Golgi cells which, in turn, project back to the synapses between mossy fibres and granule cells (figure 2a). If the recurrent inhibition is allowed to change rapidly, then the resultant dynamics are very rich and can generate a wide variety of basis functions [47]. However, some of the Golgi cell inhibition appears to change very slowly, which has led to the suggestion that the granular layer generates piecewise linear approximations of nonlinear functions [19]. The present results indicate that such basis functions can be used, in practice, to compensate for certain kinds of nonlinear plant.

Second, it appears that a distributed representation of the approximate inverse model in the brainstem [12] can also help to compensate for the same kind of nonlinearity. In the oculomotor system, the agonist force needed to maintain eccentric eye-position increases supralinearly with position, yet the firing rate of individual ocular motoneurons (OMNs) varies linearly with position. However, OMN thresholds (and slopes) vary over a wide range. It has been proposed that such recruitment can help to linearize the oculomotor plant (references in [48]). Results here suggest that this putative mechanism can work in practice.

Finally, the results indicate that transferring learning from cerebellum to brainstem allows the system to compensate for creep with little increase in cerebellar output (figure 7c). In the case of VOR adaptation, where there is good evidence that in particular circumstances a similar transfer occurs [32], modelling indicates that the brainstem can learn new values of VOR gain that allow the system to operate at high frequencies (up to 25 Hz) despite a substantially delayed retinal-slip error signal (approx. 100 ms) [23]. The results here suggest learning transfer may have more generic benefits in stabilizing adaptive control output by ensuring large cerebellar outputs do not affect the stability of the recurrent loop. They provide further computational evidence as to why a powerful computational device such as the adaptive filter model of the cerebellum requires an additional site of plasticity and agree with previous computational predictions that learning occurs first in the cerebellar cortex, before transferring to the brainstem [23].

4.3. Future work

We need to understand how to control DEAs arranged in agonist–antagonist pairs [3,49]. Analysis of the oculomotor system suggests that small changes in conjugate eye-position in the horizontal plane are maintained by the minimum possible change in motor commands (the minimum-norm rule) [22]. It is therefore possible that the control scheme investigated here, which is based on the oculomotor system, could be extended to the optimal control of agonist–antagonist DEA pairs. If so it could be applied generally, and would be of special relevance to the use of EAPs as neuroprostheses [50,51] and as eye muscles for an android robot [52].

Supplementary Material

Data_Cerebellar_Inpsired_Adaptive_Control_for_Nonlinear_Artificial_Muscle
rsif20160547supp1.xlsx (20.5MB, xlsx)

Appendix A. Details of control algorithms

The control algorithms are described here using discrete time notation, where k denotes the time step. Filters are described in discrete time using the notation D(q, γ), where D(q, γ) is a linear discrete time filter, q the shift operator (q−1xk = xk−1) and γ a vector of filter parameters.

A.1. Linear control

The plant being controlled is described as

xk = fo(xk−1, …, xk−n, uk−1, …, uk−n)    (A 1)

where xk is the measured output, uk the measured input, n the system order and fo a continuous nonlinear function. We assume that there exists a unique, continuous inverse function fo−1, such that

uk−1 = fo−1(xk, xk−1, …, xk−n, uk−2, …, uk−n)    (A 2)

where fo−1 is the inverse mapping of fo and describes a one-to-one mapping from x to u.

The cerebellar element C in figure 3b is modelled as an adaptive filter (figure 2), where the output (zk) is given as a weighted sum of filtered and optimized input signals. Thus, for time step k

zk = Σi wi,k pi,k    (A 3)

where wi,k denotes the ith weight at time step k, and pi,k denotes the ith parallel fibre signal at time step k. These weights are adjusted by the error signal e (corresponding to climbing fibre input) according to the LMS learning rule [25].

wi,k+1 = wi,k − β ek p̃i,k    (A 4)

where p̃k denotes the parallel fibre signals filtered through the reference model filter (see table 2 for the discrete time reference filter definition), and ek is the sensory error signal, i.e. the difference between the actual and desired system output, ek = xk − yk.

In the present model, the basis functions implemented by the filters G1 … GN are alpha functions (second-order low-pass filters with a repeated root), each described by a single parameter Ti, the time constant of the ith fixed filter (see table 2 for the discrete time alpha filter approximation). These basis functions replace the most commonly used tapped delay line FIR filter and greatly reduce the number of adaptable weights required [53,54]. The output of these filters is denoted gk. To speed learning, the outputs gk are transformed by the fixed matrix Q to give parallel fibre signals pk

pk = Qgk    (A 5)

where Q is a fixed square matrix, designed offline to exactly orthonormalize the filtered brainstem output when there is no cerebellar contribution, i.e. zk = 0 (for further details on the design of Q, see [7]).

A.2. Nonlinear control-adaptive filter

In the nonlinear adaptive filter, the signals being weighted are nonlinear functions of the input signal, and the output is a linear-in-weights combination of these signals. For the linear case, the vector gk is the output of a bank of fixed, linear filters (figure 3b). Here, we extend this to the nonlinear case (figure 4d) and express gk as

gk = f1([G1(q, γ), …, Gnf(q, γ)]T f2(uk−1)) + b    (A 6)

where f1 is a nonlinear function of the filter outputs, f2 is a nonlinear function of the filter inputs, nf is the number of filters, Gi(q, γ) is a fixed discrete time filter, where γ is a vector of filter parameters (we call the bank of fixed filters ‘basis functions’), and b is a discrete bias term. For the case f1(x) = x and f2(u) = u, equation (A 6) reduces to a linear adaptive filter. Here, we do not transform the filter outputs, so trivially f1(x) = x. We construct a nonlinear basis by thresholding the inputs to the linear basis filters so that only motor commands above a certain threshold are passed on; a range of threshold values, as well as the original motor command signal, were used (inspired by the suggestion that the granular layer generates threshold-linear processing elements). This nonlinear transformation of the inputs can be expressed as

f2(uk) = [uk, H(uk − σ1)(uk − σ1), …, H(uk − ση)(uk − ση)]T    (A 7)

The input uk is transformed into a vector that contains uk as well as thresholded versions of uk. H is the Heaviside step function, η is the number of thresholded terms and σ is a vector of threshold cut-off values. Equation (A 7) can be written compactly as qk = f2(uk), where qk is the vector of thresholded signals.
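
A minimal sketch of this input expansion in Python (the ordering of the elements and the handling of the unthresholded term follow equation (A 7) as reconstructed above, and are otherwise assumptions):

import numpy as np

def f2(u, sigma):
    """Expand the scalar motor command u into itself plus eta rectified copies,
    one per threshold in sigma: [u, H(u - sigma_1)(u - sigma_1), ...]."""
    return np.concatenate(([u], np.heaviside(u - sigma, 0.0) * (u - sigma)))

q = f2(3.0, np.array([2.18, 2.48, 2.78, 3.08, 3.38]))   # thresholds from table 2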

A.3. Nonlinear control-brainstem

Figure 4a shows a general Hammerstein model of a plant, and figure 4b shows its nonlinear inverse controller, which consists of an LDS (i.e. a fixed linear filter BL(q, γ)) followed by an SNL. The output vk of the fixed linear filter is given as

vk = BL(q, γ)(rk + zk)    (A 8)

The SNL of the brainstem is designed to compensate for the plant nonlinearity, assuming there exists a unique, continuous function that gives the inverse mapping of that nonlinearity (see above). Perfect compensation is achieved if the SNL in the brainstem equals this inverse, and so the brainstem nonlinearity is designed to approximate it. Here, we use a series of piecewise linear elements to approximate a continuous nonlinear function (as shown in figure 4e and inspired by the threshold elements found in the brainstem)

uk = Σj gj H(vk − ρj)(vk − ρj),  j = 1, …, m    (A 9)

where m is the number of thresholded, piecewise linear terms, ρ is a vector of threshold cut-off values and gj is the gain of the jth piecewise linear element.
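
The sketch below implements this piecewise linear SNL and shows, for an arbitrary smooth target function (not the DEA's actual inverse nonlinearity), how a least-squares fit of the gains can approximate it; the target curve, the fitting procedure and the use of the table 2 thresholds are illustrative assumptions.

import numpy as np

def brainstem_snl(v, gains, rho):
    """Equation (A 9): u = sum_j g_j * H(v - rho_j) * (v - rho_j)."""
    return np.maximum(np.atleast_1d(v)[:, None] - rho, 0.0) @ gains

# illustrative fit: choose gains so that the hinge sum approximates a smooth curve
v = np.linspace(0.0, 2.0, 200)
rho = np.array([0, .255, .51, .765, 1.02, 1.275, 1.53, 1.785])   # cf. table 2
target = 2.5 * np.arctanh(v / 3.0)                               # arbitrary inverse-like SNL
basis = np.maximum(v[:, None] - rho, 0.0)
gains, *_ = np.linalg.lstsq(basis, target, rcond=None)
max_err = np.max(np.abs(basis @ gains - target))                 # small residual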

A.4. Linear proportional-integral-derivative control

A linear proportional-integral-derivative controller (PID controller) was also applied to the simulated DEA (see the Control evaluation section of this appendix). The discrete time PID controller is

[equation (A 10): discrete-time PID control law]

where Kp, Ki, Kd are the controller gains, Td a term used to limit the high-frequency gain of the controller and Ts the sampling period (0.02 s). The controller parameters (Kp = 1.3, Ki = 3, Kd = 5.3, Td = 4.7) were estimated as the parameters that minimized the total squared error over time when controlling the simulated DEA.
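
Since the discrete control law itself is only referenced above, the following is one common discretization consistent with the stated parameters; the exact role of Td (treated here as setting a first-order filter on the derivative term) is an assumption.

def make_pid(kp=1.3, ki=3.0, kd=5.3, td=4.7, ts=0.02):
    integ, d_state, e_prev = 0.0, 0.0, 0.0
    def pid(e):
        nonlocal integ, d_state, e_prev
        integ += ki * ts * e                               # integral term
        alpha = td * ts / (1.0 + td * ts)                  # assumed derivative filter coefficient
        d_state = (1.0 - alpha) * d_state + alpha * (e - e_prev) / ts
        e_prev = e
        return kp * e + integ + kd * d_state
    return pid

pid = make_pid()
u = pid(0.05)    # control output for a tracking error of 0.05 mm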

A.5. Learning in the brainstem

The gains of the piecewise linear elements can be learnt online by transferring learning from the cerebellum back to the brainstem. This is done using a Hebbian learning rule, in which the gain of the jth piecewise linear element at time step k + 1, for j = 1, …, m, is given as

gj,k+1 = gj,k + ζ zk μj,k  (j = 1);  gj,k+1 = gj,k + ζ zk μj,k − ζ zk μj−1,k  (j > 1)    (A 11)

where ζ is the learning rate and μj,k represents the jth piecewise linear element at time k, i.e. μj,k = H(vk − ρj)(vk − ρj). The additional term at the end of the expression for the cases when j > 1 removes the effect of changes in gains at lower thresholds on the gain at higher thresholds.
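
A sketch of this update in code (the vectorized form and the default rate are implementation choices; the rule itself follows equation (A 11) as reconstructed above):

import numpy as np

def transfer_to_brainstem(gains, z, mu, zeta=0.01):
    """Nudge each piecewise linear gain g_j by the correlation of the cerebellar
    output z with its element mu_j, subtracting (for j > 1) the part already
    carried by the element one threshold below."""
    g = gains.copy()
    g[0] += zeta * z * mu[0]
    g[1:] += zeta * z * (mu[1:] - mu[:-1])
    return g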

A.6. Parameters

The algorithm requires the following parameters to be specified before implementation: rate of error learning (β); rate of brainstem learning (ζ); linear brainstem filter (BL(q, γ)); time constant of reference model filter (τ); number of thresholded terms in the cerebellum (η) and the corresponding cut-off values (σ); number of alpha filters (nf) and corresponding time constants (Ti); number of piecewise linear terms in the brainstem (m) and corresponding cut-off values (ρ); and the scale of the cerebellar bias.

Some parameters differed between particular control conditions, whereas others were fixed for all experiments. Parameter values and the initial conditions for each control condition are described in the Control evaluation section.

A.7. Control evaluation

The control algorithm was implemented both online in the real system (as described above), and in simulation. In simulation, a previously identified model of the DEA plant was used instead of the physical DEA (details of the model and parameter estimation are provided in [7]). The plant model used to transform an input uk into an output xk is described in equations (A 12)–(A 14) (see also figure 4a).

[equations (A 12)–(A 14): Hammerstein model of the simulated DEA plant]

The model parameters (initial values bk = 0.3, ck = −0.4, dk = 0.5, ek = 2.2) were set to produce behaviour similar to that of the actual actuator, and were adapted at each time step (by δb = 7 × 10−8, δc = 7 × 10−6, δd = 1.3 × 10−6, δe = 2.3 × 10−6).

The control algorithm was tested under different conditions by varying the control parameters. The following conditions were tested: linear control with a linear brainstem and linear cerebellum (first scheme); nonlinear control with a linear brainstem and nonlinear cerebellum (second scheme); a PID-based linear controller (third scheme); nonlinear control with a fixed brainstem nonlinearity and linear cerebellum (fourth scheme); nonlinear control with a fixed brainstem nonlinearity and nonlinear cerebellum (fifth scheme); and nonlinear control using a nonlinear brainstem with adaptive piecewise linear gains and a nonlinear cerebellum (sixth scheme). All conditions were tested in simulation, and the first and last were also tested on the physical actuator.

Details of the parameters and initial conditions for each experimental case are provided in table 2. In each control experiment, the reference signal rk was low-pass filtered white noise with frequency range 0–1 Hz.

Authors' contributions

E.D.W. carried out the experiments, data analysis and algorithm design. T.A. and J.M.R. provided the experimental rig and assisted in the experiments; M.J.P., J.P. and S.R.A. assisted with data analysis and algorithm design; P.D. prepared the article and contributed biological background. All authors contributed to the design of the study.

Competing interests

We declare we have no competing interests.

Funding

Preparation of this article was supported by a grant from the EPSRC (EP/I032533/1).

References

  • 1.Rus D, Tolley MT. 2015. Design, fabrication and control of soft robots. Nature 521, 467–475. ( 10.1038/nature14543) [DOI] [PubMed] [Google Scholar]
  • 2.Kim S, Laschi C, Trimmer B. 2013. Soft robotics: a bioinspired evolution in robotics. Trends Biotechnol. 31, 23–30. ( 10.1016/j.tibtech.2013.03.002) [DOI] [PubMed] [Google Scholar]
  • 3.Anderson IA, Gisby TA, McKay TG, O'Brien BM, Calius EP. 2012. Multi-functional dielectric elastomer artificial muscles for soft and smart machines. J. Appl. Phys. 112, 041101 ( 10.1063/1.4740023) [DOI] [Google Scholar]
  • 4.O'Halloran A, O'Malley F, McHugh P. 2008. A review on dielectric elastomer actuators, technology, applications, and challenges. J. Appl. Phys. 104, 071101 ( 10.1063/1.2981642) [DOI] [Google Scholar]
  • 5.Carpi F, Kornbluh R, Sommer-Larsen P, Alici G. 2011. Electroactive polymer actuators as artificial muscles: are they ready for bioinspired applications? Bioinsp. Biomim. 6, 045006 ( 10.1088/1748-3182/6/4/045006) [DOI] [PubMed] [Google Scholar]
  • 6.Wissler M, Mazza E. 2005. Modeling of a pre-strained circular actuator made of dielectric elastomers. Sensors Actuat. A Phys. 120, 184–192. ( 10.1016/j.sna.2004.11.015) [DOI] [Google Scholar]
  • 7.Wilson ED, Assaf T, Pearson MJ, Rossiter JM, Dean P, Anderson SR, Porrill J. 2015. Biohybrid control of general linear systems using the adaptive filter model of cerebellum. Front. Neurorobot. 9, 5. ( 10.3389/fnbot.2015.00005) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Jacobs WR, Wilson ED, Assaf T, Rossiter J, Dodd TJ, Porrill J, Anderson SR. 2015. Control-focused, nonlinear and time-varying modelling of dielectric elastomer actuators with frequency response analysis. Smart Mater. Struct. 24, 055002 ( 10.1088/0964-1726/24/5/055002) [DOI] [Google Scholar]
  • 9.Druitt CM, Alici G. 2014. Intelligent control of electroactive polymer actuators based on fuzzy and neurofuzzy methodologies. IEEE/ASME Trans. Mechatronics 19, 1951–1962. ( 10.1109/tmech.2013.2293774) [DOI] [Google Scholar]
  • 10.Carpenter RHS. 1988. Movements of the eyes, 2nd edn. London, UK: Pion. [Google Scholar]
  • 11.Skavenski AA, Robinson DA. 1973. Role of abducens neurons in vestibuloocular reflex. J. Neurophysiol. 36, 724–738. [DOI] [PubMed] [Google Scholar]
  • 12.Porrill J, Dean P, Anderson SR. 2013. Adaptive filters and internal models: multilevel description of cerebellar function. Neural Netw. 47, 134–149. ( 10.1016/j.neunet.2012.12.005) [DOI] [PubMed] [Google Scholar]
  • 13.Ito M. 1984. The cerebellum and neural control. New York, NY: Raven Press. [Google Scholar]
  • 14.Dean P, Porrill J, Stone JV. 2002. Decorrelation control by the cerebellum achieves oculomotor plant compensation in simulated vestibulo-ocular reflex. Proc. R. Soc. Lond. B 269, 1895–1904. ( 10.1098/rspb.2002.2103) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Porrill J, Dean P, Stone JV. 2004. Recurrent cerebellar architecture solves the motor error problem. Proc. R. Soc. Lond. B 271, 789–796. ( 10.1098/rspb.2003.2658) [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Fujita M. 1982. Adaptive filter model of the cerebellum. Biol. Cybern. 45, 195–206. ( 10.1007/BF00336192) [DOI] [PubMed] [Google Scholar]
  • 17.Dean P, Porrill J, Ekerot CF, Jörntell H. 2010. The cerebellar microcircuit as an adaptive filter: experimental and computational evidence. Nat. Rev. Neurosci. 11, 30–43. ( 10.1038/nrn2756) [DOI] [PubMed] [Google Scholar]
  • 18.Widrow B, Walach E. 2008. Adaptive inverse control, reissue edition: a signal processing approach. London, UK: John Wiley & Sons. [Google Scholar]
  • 19.Spanne A, Jorntell H. 2013. Processing of multi-dimensional sensorimotor information in the spinal and cerebellar neuronal circuitry: a new hypothesis. PLoS Comput. Biol. 9, e1002979 ( 10.1371/journal.pcbi.100297) [DOI] [PMC free article] [PubMed] [Google Scholar]
20. Ghez C, Hening W, Gordon J. 1991. Organization of voluntary movement. Curr. Opin. Neurobiol. 1, 664–671. (doi:10.1016/S0959-4388(05)80046-7)
21. Henneman E, Mendell LM. 1981. Functional organization of motoneuron pool and its inputs. In Handbook of physiology, the nervous system, motor control, vol. II, sect. I, part 1 (ed. Brooks VB), pp. 423–507. Bethesda, MD: American Physiological Society.
22. Dean P, Porrill J, Warren PA. 1999. Optimality of static force control by horizontal eye muscles: a test of the minimum norm rule. J. Neurophysiol. 81, 735–757.
23. Porrill J, Dean P. 2007. Cerebellar motor learning: when is cortical plasticity not enough? PLoS Comput. Biol. 3, 1935–1950. (doi:10.1371/journal.pcbi.0030197)
24. Webb B. 2002. Robots in invertebrate neuroscience. Nature 417, 359–363. (doi:10.1038/417359a)
25. Widrow B, Stearns SD. 1985. Adaptive signal processing. Englewood Cliffs, NJ: Prentice Hall.
26. Haykin S. 2002. Adaptive filter theory, 4th edn. Upper Saddle River, NJ: Prentice Hall.
27. Coenen OJ-MD, Arnold MP, Sejnowski TJ, Jabri MA. 2001. Parallel fiber coding in the cerebellum for life-long learning. Auton. Robots 11, 291–297. (doi:10.1023/A:1012403510221)
28. Porrill J, Dean P. 2007. Recurrent cerebellar loops simplify adaptive control of redundant and nonlinear motor systems. Neural Comput. 19, 170–193. (doi:10.1162/neco.2007.19.1.170)
29. Deng H, Li HX, Wu YH. 2008. Feedback-linearization-based neural adaptive control for unknown nonaffine nonlinear discrete-time systems. IEEE Trans. Neural Netw. 19, 1615–1625. (doi:10.1109/tnn.2008.2000804)
30. Fuchs AF, Scudder CA, Kaneko CRS. 1988. Discharge patterns and recruitment order of identified motoneurons and internuclear neurons in the monkey abducens nucleus. J. Neurophysiol. 60, 1874–1895.
31. Dean P. 1996. Motor unit recruitment in a distributed model of extraocular muscle. J. Neurophysiol. 76, 727–742.
32. Boyden ES, Katoh A, Raymond JL. 2004. Cerebellum-dependent learning: the role of multiple plasticity mechanisms. Annu. Rev. Neurosci. 27, 581–609. (doi:10.1146/annurev.neuro.27.070203.144238)
33. Menzies JRW, Porrill J, Dutia M, Dean P. 2010. Synaptic plasticity in medial vestibular nucleus neurons: comparison with computational requirements of VOR adaptation. PLoS ONE 5, e13182. (doi:10.1371/journal.pone.0013182)
34. John SW, Alici G, Cook CD. 2010. Inversion-based feedforward control of polypyrrole trilayer bender actuators. IEEE/ASME Trans. Mechatronics 15, 149–156. (doi:10.1109/tmech.2009.2020732)
35. Hao LN, Li Z. 2010. Modeling and adaptive inverse control of hysteresis and creep in ionic polymer-metal composite actuators. Smart Mater. Struct. 19, 025014. (doi:10.1088/0964-1726/19/2/025014)
36. Ozsecen MY, Mavroidis C. 2010. Nonlinear force control of dielectric electroactive polymer actuators. In Electroactive polymer actuators and devices (ed. Bar-Cohen Y). Proc. SPIE 7642, 76422C. Bellingham, WA: SPIE.
37. Dong R, Tan X. 2012. Modeling and open-loop control of IPMC actuators under changing ambient temperature. Smart Mater. Struct. 21, 065014. (doi:10.1088/0964-1726/21/6/065014)
38. Sarban R, Jones RW. 2012. Physical model-based active vibration control using a dielectric elastomer actuator. J. Intell. Mater. Syst. Struct. 23, 473–483. (doi:10.1177/1045389X11435430)
39. Vunder V, Itik M, Poldsalu I, Punning A, Aabloo A. 2014. Inversion-based control of ionic polymer-metal composite actuators with nanoporous carbon-based electrodes. Smart Mater. Struct. 23, 025010. (doi:10.1088/0964-1726/23/2/025010)
40. Rizzello G, Naso D, York A, Seelecke S. 2015. Modeling, identification, and control of a dielectric electro-active polymer positioning system. IEEE Trans. Control Syst. Technol. 23, 632–643. (doi:10.1109/tcst.2014.2338356)
41. Floreano D, Ijspeert AJ, Schaal S. 2014. Robotics and neuroscience. Curr. Biol. 24, R910–R920. (doi:10.1016/j.cub.2014.07.058)
42. van der Smagt P. 2000. Benchmarking cerebellar control. Robot. Auton. Syst. 32, 237–251. (doi:10.1016/S0921-8890(00)00090-7)
43. Lenz A, Anderson SR, Pipe AG, Melhuish C, Dean P, Porrill J. 2009. Cerebellar inspired adaptive control of a compliant robot actuated by pneumatic artificial muscles. IEEE Trans. Syst. Man Cybern. B 39, 1420–1433. (doi:10.1109/TSMCB.2009.2018138)
44. Luque NR, Garrido JA, Carrillo RR, D'Angelo E, Ros E. 2014. Fast convergence of learning requires plasticity between inferior olive and deep cerebellar nuclei in a manipulation task: a closed-loop robotic simulation. Front. Comput. Neurosci. 8, 97. (doi:10.3389/fncom.2014.00097)
45. Casellato C, Antonietti A, Garrido JA, Ferrigno G, D'Angelo E, Pedrocchi A. 2015. Distributed cerebellar plasticity implements generalized multiple-scale memory components in real-robot sensorimotor tasks. Front. Comput. Neurosci. 9, 24. (doi:10.3389/fncom.2015.00024)
46. Yamazaki T, Tanaka S. 2007. The cerebellum as a liquid state machine. Neural Netw. 20, 290–297. (doi:10.1016/j.neunet.2007.04.004)
47. Rössert C, Dean P, Porrill J. 2015. At the edge of chaos: how cerebellar granular layer network dynamics can provide the basis for temporal filters. PLoS Comput. Biol. 11, e1004515. (doi:10.1371/journal.pcbi.1004515)
48. Anderson SR, Lepora NF, Porrill J, Dean P. 2010. Nonlinear dynamic modelling of isometric force production in primate eye muscle. IEEE Trans. Biomed. Eng. 57, 1554–1567. (doi:10.1109/TBME.2010.2044574)
49. Van Ham R, Sugar TG, Vanderborght B, Hollander KW, Lefeber D. 2009. Compliant actuator designs: review of actuators with passive adjustable compliance/controllable stiffness for robotic applications. IEEE Robot. Automat. Mag. 16, 81–94. (doi:10.1109/mra.2009.933629)
50. Senders CW, Tollefson TT, Curtiss S, Wong-Foy A, Prahlad H. 2010. Force requirements for artificial muscle to create an eyelid blink with eyelid sling. Arch. Facial Plast. Surg. 12, 30–36. (doi:10.1001/archfacial.2009.111)
51. Ledgerwood LG, Tinling S, Senders C, Wong-Foy A, Prahlad H, Tollefson TT. 2012. Artificial muscle for reanimation of the paralyzed face. Arch. Facial Plast. Surg. 14, 413–418. (doi:10.1001/archfacial.2012.696)
52. Carpi F, De Rossi D. 2007. Bioinspired actuation of the eyeballs of an android robotic face: concept and preliminary investigations. Bioinsp. Biomim. 2, S50–S63. (doi:10.1088/1748-3182/2/2/S06)
53. Ljung L. 1999. System identification: theory for the user, 2nd edn. Upper Saddle River, NJ: Prentice Hall PTR.
54. Anderson SR, Pearson MJ, Pipe A, Prescott T, Dean P, Porrill J. 2010. Adaptive cancelation of self-generated sensory signals in a whisking robot. IEEE Trans. Robot. 26, 1065–1076. (doi:10.1109/tro.2010.2069990)


Supplementary Materials

Data_Cerebellar_Inpsired_Adaptive_Control_for_Nonlinear_Artificial_Muscle
rsif20160547supp1.xlsx (20.5MB, xlsx)
