Skip to main content
Frontiers in Neurorobotics logoLink to Frontiers in Neurorobotics
. 2019 Aug 28;13:70. doi: 10.3389/fnbot.2019.00070

A Biomimetic Control Method Increases the Adaptability of a Humanoid Robot Acting in a Dynamic Environment

Marie Claire Capolei 1,*, Emmanouil Angelidis 2, Egidio Falotico 3, Henrik Hautop Lund 1, Silvia Tolu 1,*
PMCID: PMC6722230  PMID: 31555117

Abstract

One of the big challenges in robotics is to endow agents with autonomous and adaptive capabilities. With this purpose, we embedded a cerebellum-based control system into a humanoid robot that becomes capable of handling dynamical external and internal complexity. The cerebellum is the area of the brain that coordinates and predicts the body movements throughout the body-environment interactions. Different biologically plausible cerebellar models are available in literature and have been employed for motor learning and control of simplified objects. We built the canonical cerebellar microcircuit by combining machine learning and computational neuroscience techniques. The control system is composed of the adaptive cerebellar module and a classic control method; their combination allows a fast adaptive learning and robust control of the robotic movements when external disturbances appear. The control structure is built offline, but the dynamic parameters are learned during an online-phase training. The aforementioned adaptive control system has been tested in the Neuro-robotics Platform with the virtual humanoid robot iCub. In the experiment, the robot iCub has to balance with the hand a table with a ball running on it. In contrast with previous attempts of solving this task, the proposed neural controller resulted able to quickly adapt when the internal and external conditions change. Our bio-inspired and flexible control architecture can be applied to different robotic configurations without an excessive tuning of the parameters or customization. The cerebellum-based control system is indeed able to deal with changing dynamics and interactions with the environment. Important insights regarding the relationship between the bio-inspired control system functioning and the complexity of the task to be performed are obtained.

Keywords: biomimetic, cerebellar control, motor learning, humanoid robot, adaptive system, forward model, bio-inspired, neurorobotics

1. Introduction

Controlling a robotic system that operates in an uncertain environment can be a difficult task if the analytical model of the system is not accurate. Models are the most essential tools in robotic control (Francis and Wonham, 1976), however, modeling errors are frequently inevitable in complex robots, for instance humanoids and soft robots. Such redundant modern robots are mechanically complex and often interacts with unstructured dynamical environments (Nakanishi et al., 2008; Nguyen-Tuong et al., 2009). Traditional hand-crafted models and standard physics-based modeling techniques do not sufficiently take into account all the unknown nonlinearities and complexities that these system present. This lack consequentially leads to a reduced tracking accuracy or, in the worst case, to unstable null-space behavior.

Modern autonomous and cognitive robots are requested to adapt not only the decisions but also the forces exerted in any varying condition and environment. The selected movement can not be executed properly if the robot does not adjust the forces according to the changing dynamics. Because of this, modern learning control methods should automatically generate model based on sensor data streams, so that the robot is not a closed entity, but a system that interacts, and evolves through the interaction with a dynamic environment.

In this paper, we intend to design an adaptive learning algorithm to control the movements of a complex nonlinear dynamical system. In particular, we assume that: the Jacobian poorly describes the actual system; the robot interacts with one or more unmodeled external objects; the sensor-actuator system is distributed and not all the states are observable or can be describe with parametric function designed off-line; the action/state space is continuous and high-dimensional. The control system should solve the inverse dynamics control problem of a multiple-joint robotic system affected by static and dynamic external disturbances during the execution of a repeated task. The controller is envisioned to reduce the tracking accuracy of each actuator through force-based control input.

In early days of adaptive self-tuning control, models were learned by fitting open parameters of predefined parametric models (Atkeson et al., 1986; Annaswamy and Narendra, 1989; Wittenmark, 1995; Khalil and Dombre, 2002). Although this method had great success in system identification and adaptive control techniques (Ljung, 2007), the estimation of the open parameters can lead to several problems, such as: slow adaptation; unmodeled behavior and persistent excitation issue (Narendra and Annaswamy, 1987); inconsistency of the estimated physical parameters (Ting et al., 2006); unstable reaction to high estimation error. In recent years, non-parametric approach has been shown to be an efficient tool in the resolution and prevention of the aforementioned problems thanks to the adaptation of the model to the data complexity (Nguyen-Tuong and Peters, 2011), and several methods have been proposed (Farrell and Polycarpou, 2006), such as neural networks (Patino et al., 2002), and statistical methods (Kocijan et al., 2004; Nakanishi and Schaal, 2004; Nakanishi et al., 2005).

In the eighties, Narendra's research group at Yale University exploited the adaptability of artificial neural networks (ANNs) to identify and control nonlinear dynamical systems (Narendra and Mukhopadhyay, 1991a,b, 1997; Narendra and Parthasarathy, 1991). Their experiments showed that the versatility of the ANNs resulted beneficial for controlling the different behaviors that characterize complex dynamical systems. Although the robustness of the classic parametric method in most of the control scenarios, ANNs were largely used in adaptive control to overcome uncertainties, unmodeled nonlinearities and to handle more complex state space systems (Glanz et al., 1991; Sontag, 1992; Zhang et al., 2000; Patino et al., 2002; He et al., 2016, 2018). As matter of fact, the non-linear components and the layered structure that distinguish the ANNs facilitate the mapping and constrain the effects of nonlinearities. Furthermore, the on-line adjustment of the parameters respect to the input-output relationship without any strict structural parameterization results advantageous for adapting to time-dependent changes.

In the Nighties thanks to the extended application of ANNs in robotics, Juyang Weng introduced the Autonomous Mental Development approach (AMD) to artificial intelligence (Weng et al., 1999a; Weng and Hwang, 2006). Weng theories were mainly inspired by how the biological systems efficiently calibrate their movements under internal and environmental changes. Accordingly to AMD the robot have to be embodied in the environment, and its processing is not preprogrammed but is the result of the continuous and real-time interaction within the two systems (Weng et al., 1999b, 2000; Weng, 2002). Respect to classic parametric approaches, the developing artificial agent creates and adapts models describing itself and its relation with the environment rather than learning and estimating parameters of a mathematical model built off-line. These theories found large application for high level cognition tasks (see Vernon et al., 2007 for a review) but were also applied to low level control in visually-guided robots (Metta et al., 1999; Ugur et al., 2015; Luo et al., 2018).

With the aim of mimicking artificially the motor efficiency of the biological system, James S. Albus proposed a neural network-based learning algorithm for robotic controller based on theories of central nervous system (CNS) structure and function: the “cerebellar model articulation controller,” commonly known as CMAC module (Albus, 1972). Several studies in literature demonstrated that, the anatomy and physiology of the cerebellum is suitable for the acquisition, development, storage and use of the internal models describing the interaction within body and environment (Wolpert et al., 1998). Moreover, the cerebellum is composed by separated regions which functionality relies both on the internal structure of the circuit and on the connection with other CNS areas (Houk and Wise, 1995; Caligiore et al., 2017): each region receives both the desired movements from the cortex and the sensory information from tendons, joints and muscles spindles and elaborates a signal that corrects whereas other CNS region are lacking. As matter of fact, subjects affected by cerebellum damage often present motor deficit, such as uncoordinated and ballistic multiple-joint movements (Schmahmann, 2004). For this reason in the last decades, scientists tried to explain the roles of the cerebellum in motor control, especially its contribution to sensory acquisition and timing and its involvement in the prediction of the sensory consequences of action. Moreover, this adaptive control nature motivated several researchers toward a deeper understanding of the cerebellum for robotics application.

Two main research lines born since Marr and Albus proposed the first artificial cerebellum-like network as pattern-classifier for controlling a robotic manipulator (Marr, 1969; Albus, 1972): the first research line focuses on purely industrial application and has as major representative W. Thomas Miller; the second research line, mainly represented by Mitsuo Kawato, deep-rooted in neuroscience and kept investigating on the biological evidence of the cerebellum structure and functionalities in relation to other CNS areas (Kawato et al., 1987; Kawato, 1999).

Miller applied the CMAC module in a closed loop vision-based controller to solve the forward mapping with direct modeling (Miller, 1987). Although the advantages, such as the rapid algorithmic computation based on least-mean-square training and the fast incremental learning, this approach lack of generalization and is sensitive to noise and large error (Miller et al., 1990). Over the years, researchers have been focusing on solving these drawbacks and the CMAC module has been mostly used as non-linear function approximator to boost the tracking accuracy of the adaptive controller and mitigate the effects of the approximation errors (Lin and Chen, 2007; Chen, 2009; Guan et al., 2018; Jiang et al., 2018). Although the promising results obtained by these applications of the CMAC network, this industrial research line did not completely exploit the overall capabilities and components of the cerebellum. It is worthy to note that the CMAC module mimic the cerebellar circuit only at the granular-purkinje level, for this reason only the mapping and classification functionalities are exploited.

The neuroscientific research line has been investigating mainly on the layered structure of the cerebellar circuit proposing several synaptic plasticity models (Luque et al., 2011, 2014, 2016; Casellato et al., 2015; D'Angelo et al., 2016; Antonietti et al., 2017), network models (Chapeau-Blondeau and Chauvet, 1991; Buonomano and Mauk, 1994; Ito, 1997; Mauk and Donegan, 1997; Yamazaki and Tanaka, 2007; Dean et al., 2010), adaptive linear filter model (Fujita, 1982; Barto et al., 1999; Fujiki et al., 2015), and combination of both (Tolu et al., 2012, 2013). These cerebellar-like models were embedded into bio-inspired control architectures to analyze how the cerebellum adjusts the output of the descending motor system of the brain during the generation of movements (Kawato et al., 1987; Ito, 2008), and how it predicts the action, minimizes the sensory discrepancy and cancels the noise (Nowak et al., 2007; Porrill and Dean, 2007). The experiments regarded the generation of voluntary movements with both simulated and real robots, e.g., eye blinking classical conditioning (Antonietti et al., 2017), vestibulo-ocular task (Casellato et al., 2014), the gaze stabilization (Vannucci et al., 2016), and perturbed arm reaching task operating in closed-loop (Garrido Alcazar et al., 2013; Tolu et al., 2013; Luque et al., 2016; Ojeda et al., 2017). From the analysis of the literature, it then emerged that research groups have treated the robots as stand-alone systems without interactions with the environment, while the real world is more complex and every external interaction counts. It is worth mentioning that the previous works have been employed for motor learning and control of simplified objects.

In this paper we present a robotic control architecture to overcome modeling error and to constrain the effects of uncertainties and external disturbances. The proposed controller is composed of a static component based on a classic feedback control methods, and of an adaptive decentralized neural network that mimic the functionality and morphology of the cerebellar circuit. The cerebellar-like module add feed-forward corrective torque to the feedback controller action (Ito, 1984; Miyamoto et al., 1988). A non-parametric nonlinear function approximation algorithm have been employed to map on-line and to reduce the high dimensional and redundant input space. The algorithm creates the internal model describing the interaction within system and environment. This model is kept under development throughout the execution of the task. The neural network mimic the composition of the cerebellar microcircuit. The layered structure of the network constrains the effects of nonlinearities and external perturbations. The network weights are based on non-linear and multidimensional learning rules that mimic the cerebellar synaptic plasticities (Garrido Alcazar et al., 2013; Luque et al., 2014).

This manuscript extends the previous works under three main aspects: 1. cerebellar-like network topology and input data; 2. feedback control-input; 3. dynamic control under external changing conditions. With the aim at giving more insights into the capacity of the cerebellum of generating control terms in the framework of accurate control tasks, the following research questions come naturally to mind: can a control system be generalized to control robotic agents by endowing them with adaptive capabilities? Can accurate and smooth actions in a dynamic environment be performed by the extrapolation of valuable sensory-motor information from heterogeneous dynamical stimuli? Does this sensory-motor information extrapolation facilitate the motor prediction and adaptation in changing conditions? The tests were carried out in the Neuro-robotics Platform (Falotico et al., 2017) with the virtual humanoid robot iCub. The robot arm has to follow a planned movement overcoming the disturbances provoked by a table attached to the hand and a ball running on it. A similar example was solved by employing a conventional control law together with computer vision techniques (Awtar et al., 2002; Levinson et al., 2010). However, this approach assumes a fixed robot morphology defined and described before running the experiment, and there is no run-time adaptation to the “biological changes” as we see in human beings. Balancing a table with a ball running on it is a relevant example of how humans learn to calibrate, coordinate, and adapt their movements, hence, we investigate how robots can achieve this task following the biological approach. Probst et al. (2012) also followed the biological approach; they tackled the problem taking into account the dynamics of the system, four different forces are found by means of a liquid state machine and applied in four different points of the table to achieve the balancing task. A supervised learning rule is used for the training step, which concludes that after 2,500 s no further improvement of the performance is obtained.

Hence, the main advantages of our model are: the low amount of (sometimes implausible) prior information for the control, a fast reactive robotic control system, an on-line self-adaptive learning system. Thanks to these features the robot can perform a determinate physical task and adapt to changing conditions. In conclusion, this approach introduces a fast and flexible control architecture that can be applied to different robotic platforms without any/excessive customization.

In the first section that follows, we present the control architecture, the adopted cerebellar-like model and the description of the method. In the second section, we report the experimental setup as well as the results of the comparison study of four control system approaches including the respective analysis. Finally, we will discuss the main findings of the study correlating them to previous literature.

2. Materials and Methods

In this section, we present our bio-inspired approach to solve the problem of controlling the right arm of the ICub humanoid robot despite the occurrence of an external perturbation. The experiment consists of a simulated humanoid robot that executes a requested movement using three controlled joints of the right arm. During the simulation, a ball is launched on the table that is attached to the robot's right hand; the ball is free to roll on the table, as illustrated in Figure 1B. The movements of the ball are provoked by the shaking of the robot arm and consequentially of the table. The key information about the external system components (e.g., the ball and table) are reported in Table 1.

Figure 1.

Figure 1

(A) The figure illustrates the main components of the functional architecture scheme and the link with the artificial robot agent and the external system. (B) The humanoid Icub holding the table-ball system in the simulation environment NRP. (C) Three controlled joints: wrist prosup ϑ0, wrist yaw ϑ1, wrist pitch ϑ2.

Table 1.

External system features.

Mass [Kg] Volume [m3] Static friction coefficient Dynamic friction coefficient
Ball 0.01 6.54 × 10−5 0.02 0.01
Table top 0.1 9 × 10−4 0.01 0.01

The proposed control architecture (Figure 1A) is composed of three main building blocks: the robotic plant, which is the physical structure (section 2.1); the motor primitive generator, which is responsible of the trajectory generation (section 2.2); the controller, which elaborates the torque commands to move each motor to the desired set point (section 2.3).

2.1. Robotic Plant

The Icub humanoid robot is 104 cm tall and it is equipped with a large variety of sensors (such as gyroscopes, accelerometers, F/T sensors, encoders, two digital cameras) and 53 actuated joints that move the waist, head, eyes, legs, arms, and hands. During the experimental tests, eight revolute joints of the right arm were actuated: four joints were kept constant to maintain the arm up (e.g., elbow, shoulder roll, shoulder yaw, and shoulder pitch), and three joints were controlled in effort by the proposed control system (namely wrist prosup, wrist yaw and wrist pitch). The axis orientation of the controlled actuators are illustrated in Figure 1C. Additional information about the actuated joints are reported in Table 2. In this work, we used the encoder to only read the state of the controlled joints (e.g., angular position, and velocity) and save it in the process variables,

Table 2.

Actuated joints information: the wrist actuators (highlighted in yellow) are controlled in effort while the elbow and shoulder motors are kept to a constant angular position.

Min ϑ[rad] Max ϑ[rad] Max ϑ˙ [rad · sec−1] Control Value
Wrist prosup n = 0 –0.8726 0.8726 100 Effort Controlled variable
Wrist yaw n = 1 –0.4363 0.4363 100 Effort Controlled variable
Wrist pitch n = 2 –1.1344 0.1745 100 Effort Controlled variable
Elbow 0.0959 1.8500 100 Position Constant = 1.14 [rad]
Shoulder roll 0.0000 2.80649 100 Position Constant = 0.1 [rad]
Shoulder yaw –0.645772 1.74533 100 Position Constant = –0.1 [rad]
Shoulder pitch –1.65806 0.0872665 100 Position Constant = –0.9 [rad]
QN×1c(t)=[ϑc,0(t)ϑc,N(t)] where N=2, (1)
QN×1c(t)=[ϑc,0(t)ϑc,N(t)] where N=2, (2)

2.2. Motor Primitive Generator

The motor primitive generator plans the trajectory for each actuated joint and communicates the reference value to the control system at each time step. The reference angular position and velocity of each joint are defined as oscillators with fixed amplitude, natural frequency and phase,

QN×1r(t)=[ϑr,0(t)ϑr,N(t)]=[A0·sin(2πft+φ0)AN·sin(2πft+φN)], (3)
QN×1r(t)=[ϑr,0(t)ϑr,N(t)]=[2πfA0·cos(2πft+φ0)2πfAN·cos(2πft+φN)], (4)

where N = 2. The temporal frequency is f = 0.25Hz, while the oscillations A amplitude and φ phase of each joint are set to:

A1×N=[A0, A1, A2]=[0.1727, 0.1363, 0.0345] radφ1×N=[φ0, φ1, φ2]=[0.5π, 0.5π, 0.0] rad.

2.3. Controller

The controller block (Figure 1A) is composed of a static component based on classic control methods (section 2.3.1), and of an adaptive decentralized block representing the bio-inspired regulator, i.e., the cerebellar-like circuit (section 2.3.2). Both sub-blocks receive information about the Qc, Qc process variables measured from the encoders located in the robotic plant (Equations 1, 2), and the Qr,Qr reference trajectory signals from the motor primitive generator (Equations 3, 4). The controller directly sends the τtot total control input to the robot servo controller which actuates the joints for δt = 0.5s. The τtot total control input is expressed as the result of a feed-forward compensation (as the AFEL architecture proposed by Tolu et al., 2012),

τN×1tot=[τ0totτNtot]=[τ0PID+Δτ0DCNτNPID+ΔτNDCN], (5)

τtot where τnPID and ΔτnDCN (where n = 0, …, N) are the contributions from the static and the adaptive bio-inspired controller respectively.

2.3.1. Feedback Controller

The static control system refers to the classic feedback control scheme with PID regulator. It is defined static due to its time-constant control terms. The closed-loop system continuously computes the eϑn angular velocity error of each joint as the difference between the ϑr,n reference (Equation 4) and the ϑc,n process variable (Equation 2),

eN×1vel=[eϑ0eϑN]=[ϑr,0-ϑc,0ϑr,N-ϑc,N]. (6)

The eϑn error (where n = 0, …, N) is used to apply correction to each controlled joint in terms of effort,

τN×1PID=[τ0PID,  , τNPID]T, (7)

according to the independent joint control law expressed as:

τnPID(t)=KP,n·eϑn+KI,n·t-Δtteϑn(t)dt+KD,n·dϑn(t)dt        for n=0,  ,N ,   (8)

where the integration time window is Δt = 10 samples. The regulator is tuned to weakly operate in a linearized condition which excludes the presence and disturbance of the ball, hence the proportional, integrative and derivative terms are static and set respectively to,

KP=[KP,0, KP,1, KP,2]=[2.9000, 2.3000, 2.3500]KI=[KI,0, KI,1, KI,2]=[1.9400, 1.9000, 1.9000]KD=[KD,0, KD,1, KD,2]=[0.0050, 0.0001, 0.0004].

2.3.2. Cerebellar-Like Model

The proposed cerebellar-like network has been designed to solve robotic problems (Figure 2). In particular, the sensory input and the corrective action in output refer to entities regarding the actuated motors, such as motor angular position, velocity or effort. Electrophysiological evidence about the encoding of movement kinematics has been found at all levels of the cerebellum; for example, in this review (Ebner et al., 2011), reported that the mossy fibers (MF) inputs encode the position, direction, and velocity of limb movements. Moreover, many hypotheses suggest that the cerebellum directly contributes to the motor command required to produce a movement. In our model, the input-output relationship is based on the previous suggestions and the signal propagation throughout the cerebellar network layers is in accordance with the robotic control application. The main design concept is that the signal propagating inside the circuit have the same dimension of the ΔτDCN output signal from the Deep Cerebellar Nuclei (DCN). The propagated signal is modulated inside the network by other signals that are correlated with the intrinsic features of the controlled plant, such as position and velocity terms, in order to have a complete description of the state.

Figure 2.

Figure 2

Proposed cerebellar-like circuit in analogy with D'Angelo et al. (2016). (A) canonical micro-circuit. Proposed cerebellar-like neural network (B) structural partition and (C) details.

The neural network structure is divided into separated modules (Figure 2B), or namely Unit Learning Machine (uml) (Tolu et al., 2012, 2013). Assuming that the robot plant is composed by N controllable object, then each uml is specialized on the n-th controlled object (where n = 0, …, N), or rather the DCN output of the uml will be the cerebellar contribution for the specific object. The uml itself is separated into M sub-modules which represent the canonical cerebellar microcircuit (ccm). Each ccm is specialized with respect to a specific feature describing the behavior of the n-th controlled object. The overall umls and other structures, that are dedicated to the dimensionality reduction and mapping of the sensory information, compose together the Modular Cerebellar Circuit (MCC).

In the proposed experiment, the canonical cerebellar microcircuits (ccm) of each controlled object are specialized in p position and in v velocity. In details, the Purkinje layer of each n−th uml presents a pair of Purkinje cells (PC) (Figure 2C), specialized in position Pcn, p and velocity Pcn, v respectively through different climbing fibers (ion, p, and ion, v). Moreover, the bio-inspired controller receives the same sensory information of the feedback controller (section 2.3.1), but it is intended to correct the eϑn angular position error, whereas the PID corrects the eϑn angular velocity error. This is solved through the connection inferior olive-deep cerebellar nuclei (IO-DCN), which conveys information about the angular position error. An additional aspect, the inferior olive signals differs from Kawato's feedback error learning theory (Kawato, 1990) and our previous experiments (Tolu et al., 2012, 2013), because the Jacobian does not correctly approximate the system, therefore the required conditions are not satisfied and it is not efficient to compare the motor signals.

The mossy fibers transmit the information about the current and reference state of the controlled joints in terms of angular velocity to the granular cells (Gr),

MF2N×1(t)=[mf0(t)mf2N(t)]=[QN×1r(t)QN×1c(t)]=[ϑr,0(t)ϑr,N(t)ϑc,0(t)ϑc,N(t)]. (9)

The granular layer-parallel fibers network is the circuit area committed to the mapping of the mossy fibers signals and to the prediction of the next output given the current sensory input (Marr, 1969; Albus, 1971). As in our previous works (Tolu et al., 2012, 2013), we artificially represented this network with the Locally Weighted Projection Regression algorithm (LWPR) (Vijayakumar and Schaal, 2000). The LWPR resulted an efficient method for the fast on-line approximation of non-linear functions in high dimensional spaces. Given the MF(t) mossy fibers input vector (Equation 9), the LWPR creates G local linear models that in our scheme represent the Grg granular cells (for g = 0, …, G). Each linear model employs the MF(t) to make a τ^n,ggr(t) prediction of the control input τntot(t-1) (where n=1,…,N). The total output of the granular-parallel fibers network is the weighted mean of all the linear models specialized in velocity,

τ^nPF(t)=g=1g=Gwn,ggr(t)·τ^n,ggr(t)g=1g=Gwn,ggr(t) for n=1,,N, (10)

where wn,ggr and τ^n,ggr are defined in Vijayakumar and Schaal (2000).

In our scheme, there are two Purkinje cells per controlled joint Pcn, p and Pcn, v (where n = 0, …, N). The wn,ppf-pc1 synapses connecting the parallel fibers and the Pcn, p (PF-PC connection) (Garrido Alcazar et al., 2013), are modulated by the ion, p inferior olive (IO) signal,

ion,p(t)=ϑn(t), (11)

that transmits the information about the ẽϑn normalized angular position error of the n−th joint,

eϑn(t)=ϑr,n(t)-ϑc,n(t), (12)

while the wn,vpf-pc 1 synaptic strengths between the parallel fibers and the Pcn, v, are modulated by the ion, v inferior olive signal,

ion,v(t)=ϑn(t), (13)

that transmits the information about the ϑn normalized angular velocity error of the n−th joint (Equation 6). The wpf-pc(t,io0(t)) weighting kernel tends to support the control actions that lead to an error lower than a specific threshold ethresh,

eϑthresh,pc=[eϑ0thresh,pceϑNthresh,pc]=[w0,ppf-pc(t,io0p(t)=0)·max(eϑ0)wN,ppf-pc(t,ioN,p(t)=0)·max(eϑN)] =[0.0120.0080.002][rad], (14)
eϑthresh,pc=[eϑ0thresh,pceϑNthresh,pc]=[w0,vpf-pc(t,io0v(t)=0)·max(eϑ0)wN,vpf-pc(t,ioN,v(t)=0)·max(eϑN)] =[0.0120.0080.002][rad·sec-1]. (15)

Respect to our previous work (Tolu et al., 2012, 2013) the output signals of the Purkinje cells are directly function of the τ^nPF(t) prediction instead of the wn,ggr weights,

τn,pPC(t)=wn,ppf-pc(t,ion,p(t))·τ^nPF(t) (16)
τn,vPC(t)=wn,vpf-pc(t,ion,v(t))·τ^nPF(t). (17)

Afterwards, the τn,pPC(t) τn,vPC(t) Purkinje cells signals are scaled by the synaptic weights wn,ppc-dcn and wn,vpc-dcn 2 (Garrido Alcazar et al., 2013), that are modulated by the Purkinje cells and the deep cerebellar nuclei activities (PC-DCN),

wn,ppc-dcn=f(t,τn,pPC(t),ΔτnDCN(t-1)), (18)
wn,vpc-dcn=f(t,τn,vPC(t),ΔτnDCN(t-1)). (19)

resulting in the input signals,

τn,pPC-DCN(t)=wn,ppc-dcn·τn,pPC(t) (20)
τn,vPC-DCN(t)=wn,vpc-dcn·τn,vPC(t). (21)

In addition, the deep cerebellar nuclei receives the input signals τn,pMF-DCN, τn,vMF-DCN from the mossy fibers and τn,pIO-DCN from the inferior olive. In our proposed circuit, the mossy fibers connected to the deep cerebellar nuclei (MF-DCN) conveys the information about the τntot(t-1) last control input sent to each controlled joint (Equation 5). This input is scaled by the synaptic weights wn,pmf-dcn and wn,vmf-dcn 3 (Garrido Alcazar et al., 2013), modulated by the respective n−th Purkinje cells activities,

τn,pMF-DCN(t)=wn,pmf-dcn(t,τn,pPC(t))·τntot(t-1), (22)
τn,vMF-DCN(t)=wn,vmf-dcn(t,τn,vPC(t))·τntot(t-1). (23)

The τn,pIO-DCN inferior olive contribution in the deep cerebellar nuclei (IO-DCN) is given by the ion, p (Equation 11), which is modulated by the wn,pio-dcn 4 synaptic weight (Luque et al., 2014),

τn,pIO-DCN=wn,pio-dcn(t,ion,p(t))·ion,p(t). (24)

The final ΔτnDCN cerebellar corrective term is the result of the τnMF-DCN modulated control input subtracted by the τnPC-DCN prediction modulated by the current error together with the τn,pIO-DCN modulated contribution of the error itself,

ΔτnDCN=(τn,pMF-DCN+τn,vMF-DCN)-(τn,pPC-DCN+τn,vPC-DCN) +τn,pIO-DCN, (25)

or rather,

ΔτnDCN=(τn(ϑn,τtot)+τn(ϑn,τtot))-(τ^ntot(eϑn)+τ^ntot(eϑ)) +τn(eϑn).

2.4. Proposed Experiments and Performance Measures

The proposed control scheme has been applied in four different experiments with the aim at analyzing the advantages of the bio-inspired controller in presence of dynamical disturbances. In details, the four experiments differ from the presence of the ball and the cerebellar-like controller contribution (Figure 3):

Figure 3.

Figure 3

Functional architectures representing the proposed experiments.

  • Experiment I: control input without both cerebellum contribution and ball disturbance,
    τtot=τPID  (no ball); (26)
  • Experiment II: control input with cerebellum contribution, without ball disturbance,
    τtot=τPID+ΔτDCN  (no ball); (27)
  • Experiment III: control input without cerebellum contribution, with ball disturbance,
    τtot=τPID  (ball); (28)
  • Experiment IV: control input with both cerebellum contribution and ball disturbance,
    τtot=τPID+ΔτDCN  (ball). (29)

The performance of each experiment will be measured by analysis of the mean absolute error (MAE) evolution computed for the angular position error of each controlled joint (Equation 12),

maeϑn(k)=i=tt+T|eϑn(i)|T for n=0,,N. (30)

The MAE is computed for every trajectory period T = 8 s (Equation 3).

3. Results

The software describing the system is based on the ROS (Quigley et al., 2009) messaging architecture and is integrated in the Neurorobotics Platform (NRP) (Falotico et al., 2017). The NRP is a simulation environment based on ROS and Gazebo (Koenig and Howard, 2004) which includes a variety of robots, environments and a detailed physics simulator. The three wrist motors are controlled in effort through the Gazebo service ApplyJointEffort, while the elbow and the three shoulder motors are controlled in position through their specific ROS topic. The sensory information from the encoders are received with a sampling frequency of fsampl = 50 Hz. The computer used for the test has the Ubuntu 16.04 Operating system (OS type 64−bit), the Intel Core™ i7 − 7700HQ CPU@2.80GHz × 8 processor, and the GeForce GTX 1050/PCIe/SSE2 graphics card.

Each experiment was performed 20 times with a total duration of about 3 min. The recorded data was saved in.csv files and processed for the analysis. The results are expressed as mean value of the 20 tests, and σ standard deviation or 95% confidence interval. In each experiment, the cerebellar-like circuit is activated after t = 40 s (or 10th iteration), which is the moment all the actuated joints reach a stable configuration (included the shoulder joints and the elbow). In experiments II and IV, the ball is launched on the table after t = 5 s (purple vertical line in the figures).

The comparison of the 4 experiments for each controlled joint are presented separately in 3 parts. In each part, we analyze the joint states, i.e., ϑc, n(t) angular position and ϑc,n(t) velocity (Figures 4, 6, 8), respect to the control action (Figures 5, 7, 9). Moreover, we compared the mean absolute error MAE to measure the performance of the different cases (as reported in Table 3 and illustrated in Figure 10).

Figure 4.

Figure 4

Angular position and velocity wrist prosup: comparison experiment I and II (A), with zoom on the angular position (C); comparison experiment I and II (B), with zoom on the angular position (D). The plots show the results of the 20 tests in terms of mean value (solid line) and 95% confidence interval (colored area). The vertical green line indicates the moment in which the cerebellar-like controller starts giving the corrective action (t = 40s). The vertical purple line indicates the instant the ball is launched on the table (t = 5s).

Figure 5.

Figure 5

Wrist prosup experimental results. Resulting angular position error eϑ0, comparison experiments I and II (A), comparison experiments III IV (B). Control input τ0tot evolution, comparison experiments I and II (C), comparison experiments III IV (D). Control input contributions in experiment IV comparisons between: τ0tot and τ0PID (E); τ0tot and τ0DCN (F). The plots show the results of the 20 tests in terms of mean value (solid line) and 95% confidence interval (colored area). The vertical green line indicates the moment the cerebellar-like controller starts giving the corrective action (t = 40s). The vertical purple line indicates the instant the ball is launched on the table (t = 5s).

Table 3.

The mean absolute error (MAE) of the initial and final period (T = 4 s).

Ball No ball
Cerebellum No cerebellum Cerebellum No cerebellum
Initial Final Initial Final Initial Final Initial Final
maeϑ0 [rad] μ 0.4136 0.0241 0.4148 0.1104 0.4068 0.0188 0.4136 0.1037
σ 0.0181 0.0085 0.0273 0.0242 0.0120 0.0017 0.0319 0.0075
maeϑ1 [rad] μ 0.3148 0.0689 0.3168 0.3123 0.3139 0.0671 0.3172 0.3130
σ 0.0048 0.0093 0.0070 0.0018 0.0012 0.0031 0.0016 0.0005
maeϑ2 [rad] μ 0.4380 0.0042 0.4395 0.0019 0.4437 0.0037 0.4401 0.0020
σ 0.0045 0.0012 0.0063 0.0001 0.0026 0.0003 0.0013 2.9177e-05

The results express the mean value μ and standard deviation σ of the 20 tests run for the four experiments.

3.1. Wrist Prosup

In the details of Figures 4A,B, the corrective action of the cerebellar-like circuit (Experiments II, IV) leads ϑc, 0 faster to the desired trajectory ϑr, 0 with respect to the case without corrections (Experiments I, III). ϑc, 0(t) starts getting closer to the desired position in about one period T = 4 s after the activation of the cerebellum (Figures 4C,D). In Figures 5A,B it is evident how the angular position error eϑ0 drops when the cerebellum action grows (Figures 5C,D). In particular, the mean absolute error drastically decreased by the 95 and 94% in experiment II and IV respectively, while it only decreased by the 74 and 73% in Experiment I and III (Figure 10A, the numerical results are reported in Table 3). The main difference between experiments with and without ball is the σ standard deviation. In the final period, the experiments with the ball present a larger standard deviation which is 30% (without cerebellum) and 19% (with cerebellum) respect to the NO ball-case.

Figure 10.

Figure 10

Comparison of the angular position MAE: (A) wrist prosup, (B) wrist yaw and (C) wrist pitch. The plots show the results of the 20 tests in terms of mean value (solid line) and 95% confidence interval (colored area). The vertical green line indicates the moment the cerebellar-like controller starts providing the corrective action (t = 40s or iteration = 10).

3.2. Wrist Yaw

The wrist yaw joint is the most affected by the cerebellum action. In Figure 6, it is evident how with only the PID contribution ϑc, 1(t) presents a constant and large offset with respect to ϑr, 1(t). As soon as the cerebellum contribution Δτ1DCN grows (around the 50 s, Figures 7C,D) the error descends (Figures 7A,B). The mean absolute error decreases by the 78% in experiment II and IV, while it only drops 1% in experiments I and III (Figure 10B). In the last period, the experiments with the ball have a standard deviation 30–33% larger than the NO ball-cases.

Figure 6.

Figure 6

Angular position and velocity wrist yaw: comparison experiment I and II (a), with zoom on the angular position (c); comparison experiment I and II (b), with zoom on the angular position (d). The plots show the results of the 20 tests in terms of mean value (solid line) and 95% confidence interval (colored area). The vertical green line indicates the moment the cerebellar-like controller starts giving the corrective action (t = 40s). The vertical purple line indicates the instant the ball is launched on the table (t = 5s).

Figure 7.

Figure 7

Wrist yaw experimental results. Resulting angular position error eϑ1, comparison experiments I and II (A), comparison experiments III IV (B). The τ1tot control input evolution, comparison experiments I and II (C), comparison experiments III IV (D). Control input contributions in experiment IV comparisons between: τ1tot and τ1PID (E); τ1tot and τ1DCN (F).The plots show the results of the 20 tests in terms of mean value (solid line) and 95% confidence interval (colored area). The vertical green line indicates the moment the cerebellar-like controller starts providing the corrective action (t = 40s). The vertical purple line indicates the instant the ball is launched on the table (t = 5s).

3.3. Wrist Pitch

On the other hand, the wrist pitch gains from the cerebellar action only when the error is larger than eϑ2thresh, which is around 40–60 s (Figure 8), taking into account that the cerebellum is started at t = 40 s. The Δτ1DCN gets more silent (Figures 9C–E) when the angular position error is small (Figures 9A,B). In Figure 10C is more evident how the cerebellum accelerates the corrective action between iteration 10 and 15 where the MAE with the cerebellum (experiment II) is 17% lower respect to experiment I (in experiment IV the MAE is 16% lower respect to experiment III).

Figure 8.

Figure 8

Angular position and velocity wrist pitch: comparison experiment I and II (A), with zoom on the angular position (C); comparison experiment I and II (B), with zoom on the angular position (D). The plots show the results of the 20 tests in terms of mean value (solid line) and 95% confidence interval (colored area). The vertical green line indicates the moment the cerebellar-like controller starts providing the corrective action (t = 40s). The vertical purple line indicates the instant the ball is launched on the table (t = 5s).

Figure 9.

Figure 9

Wrist pitch experimental results. Resulting angular position error eϑ2, comparison experiments I and II (A), comparison experiments III IV (B). The τ2tot control input evolution, comparison experiments I and II (C), comparison experiments III IV (D). Control input contributions in experiment IV comparisons between: τ2tot and τ2PID (E); τ2tot and τ2DCN (F). The plots show the results of the 20 tests in terms of mean value (solid line) and 95% confidence interval (colored area). The vertical green line indicates the moment the cerebellar-like controller starts providing the corrective action (t = 40s). The vertical purple line indicates the instant the ball is launched on the table (t = 5s).

4. Discussions

In this work, a bio-mimetic control scheme is presented in the framework of a robotic task, in which simultaneous control of the object dynamics and of the internal force exerted by the robot arm to follow a trajectory with the object attached to it is required. To address multi-joint corrective responses, we induced and combined three-joint wrist motions. Thus adaptation skills are required especially to deal with an external perturbation acting on the robot-object system. The main observation is that plastic mechanisms given by a feed-forward cerebellum-like controller effectively contribute to the learning of the dynamics model of the robot arm-object system and to the adaptive corrections in terms of torque commands applied to the joints. These cerebellar torque contributions, together with feedback (PID) torque outcome, allow the progressive error reduction by incorporating distributed synaptic plasticity based on the feedback from the actual movement.

The results about the three controlled joints showed a fast reactive control in the test cases when the cerebellum-like model is active, which is even more evident when the ball (random perturbation) is present as shown in Figures 4, 6, 8B,D. An incremental velocity control input is then provided to the controller of the system to deal with the perturbation. The purpose of considering a heterogeneous stochastic dynamical stimuli (board and ball) was to test and examine the activation of incremental learning and adaptation of the cerebellum-like controller and at the same time to confirm its coupling with the feedback control inputs. Previous studies have shown that the feedback processes are omnipresent in voluntary motor actions (Scott et al., 2015) and rapid corrective responses occur even for very small disturbances that approach the natural variability of limb motion. In human beings, these corrections commonly require increases in muscle activity generated i.e., by applied loads (Nashed et al., 2015). By analogy, a similar effect can be noticed at joint-level in our system. In the experimental situation, the joints that are more influenced by the limb dynamics (wrist prosup and yaw joints) under the effect of the table and ball increase their control input activity as represented in Figures 5, 7C,D, while the wrist pitch joint has a much more reduced activity re influenced by the limb dynamics (wrist prosup and yaw joints) under the effect of the table and ball increase their control input activity as represented in Figures 9C,D compared to the previous two joints. This phenomena is also reflected in the control input provided by the cerebellum-like model. The bigger the position error is at the beginning of the simulation with only the PID control case (experiments I and III) the more effective the cerebellar-like corrections are (experiments II and IV) as shown in Figures 5, 7, 9A,B. It should be noted that for the wrist pitch joint the PID controller leads to ~0.0 (rad) MAE around 40 s from the beginning of the simulation. However, among all the joints, the fundamental role of the cerebellum in motor control is confirmed by its anticipatory response for decreasing the error as it is appreciated in Figure 10. The control system achieved these result by creating up to 9 Gr receptive fields per uml at the granular level (or rather LWPR). In Figure 11, it is possible to appreciate how the IO inferior olive signals (in blue) of each ccm promptly influence the synaptic weights (in red) between the PF parallel fibers and the PC Purkinje cells (left column), and the contribution of the inferior olive itself on the DCN Deep Cerebellar nuclei corrective action (right column). In the IO-DCN connection details, the synaptic weights rapidly increment in the first tract around 40–60 s where the error is higher and then keep increasing slowly for the final adjustments. On the other hand, the PF-PC connection tends to not over-react at the beginning of the simulation around 40–60 s, while it strengthen when the error decline. We assume that this opposite influence of the IO on the synaptic weights makes possible the filtering and the dumping of any external disturbances or high error.

Figure 11.

Figure 11

Learning evolution of the cerebellar-like network in experiment IV: influence of the inferior olive on the PC-PF parallel fibers-Purkinje cells and IO-DCN inferior olive-Deep cerebellar nuclei connections. The plots show the results of the 20 tests in terms of mean value (solid line) and 95% confidence interval (colored area).

This control model proposes a plausible explanation on how control feedback is used by the central nervous system (CNS) to correct for intrinsic as well as external sources of disturbances. Furthermore, the bio-mimetic model represents a plausible control scheme for voluntary movements that can be generalized to control robotic agents without mayor tuning of the parameters. Our controller with distributed plasticity allows efficient adjustment of the corrective signal regardless of the dynamic features of the robot arm and of the way the added perturbations affect the dynamics of the arm plant involved. According to this, the controller (cerebellum-like and PID) is adaptable by providing adjustable torque commands among the joints to overcome external dynamic and stochastic perturbations and to have a both fast and precise movement. This replies to our question about if the sensory-motor information extrapolation made by the cerebellum-like facilitates motor prediction and adaptation in changing conditions. It should be noted that the adaptation mechanism adopted here is not constrained to any specific plant or testing framework, and could therefore be extrapolated to other common testing paradigms.

D'Angelo et al. (2016) illustrated in their paper the schematic representation of how the core cerebellar microcircuit is wired inside the whole brain. The proposed cerebellar-like model has been designed in analogy with it. In contrast with Garrido Alcazar et al. (2013), Casellato et al. (2014), Antonietti et al. (2017), the proposed model encodes the movement kinematics at the mossy fibers level (Ebner et al., 2011), and presents a coupling at the Purkinje layer for velocity and position terms representation. Likewise, the synaptic strengths at PC-DCN level as well the synaptic strengths at IO-DCN level are modulated by signals related to position or velocity. The mossy fibers are connected to the DCN and to some granular cells to convey the efference copy or motor command information. The IO cells are devoted to teaching signal error transmission in terms of position and velocity errors. The teaching errors modulate the synaptic strengths at PF-PC and IO-DCN levels.

Tokuda et al. (2017) postulated that high dimensionality problem (high-dimensional sensory-motor inputs vs. low training data) is accomplished by the cerebellum by regulating the synchronous firing activities of the inferior olive (IO) neurons. Though the implementation of coupling mechanisms at the inferior olive cells would be an interesting work to have a better explanation on multiple joint control. This extension could also provide additional insights into the internal connectivity of the cerebellar microcomplex. Further investigation will be possible in the future of how specific properties of the cells, of the network topology and synaptic adaptation mechanisms complement each other in the bio-inspired architecture.

4.1. Neural Basis of Feedback Control for Voluntary Movements

Feedback control of movement is essential to guarantee movement success especially to compensate for perturbation arising from the interaction with the external world. Different brain areas (primary motor cortex, primary somatosensory cortex, cerebellum, supplementary motor area, etc.) are involved during a voluntary movement and cooperate in many levels of hierarchy. Feedback control theory might be the key for understanding how the previous areas plan and control the movement hierarchically. By using control terminology, during the voluntary movement of a limb, the primary motor cortex acts as a controller, and the limb connected to neuronal circuits becomes the controlled object.

The cerebellum learns and provides the internal models that reproduce the inverse or direct dynamics of the body part. Thanks to the cerebellar internal model learning, the primary motor cortex performs the control without an external feedback (Koziol et al., 2014). By our simulations, we suggest that such behavior can be confirmed. Indeed, the cerebellar-like contributions drive the feedback controller toward better accuracy and precision of the movement. In the future, a visual feedback input will be considered to probe the sophistication of feedback control processing and cerebellar-like learning consolidation.

Author Contributions

MC and ST conceived and designed the experiments, analyzed the data, and wrote the paper. MC implemented the architecture and performed the experiments. EA contributed to materials and analysis tools. HL and EF reviewed the paper.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

1wpfpc weighting kernel parameters: LTDmax = 10−3, LTPmax = 10−3, α = 170.

2wpcdcn weighting kernel parameters: LTDmax = 10−4, LTPmax = 10−4, α = 2.

3wmfdcn weighting kernel parameters: LTDmax = 10−4, LTPmax = 10−4, α = 2.

4wn,pio-dcn weighting kernel parameters: MTDmax = −10−4, MTPmax = −10−5, α = 100.

Funding. This work has received funding from the Marie Curie project No. 705100 (Biomodular), from the EU-H2020 Framework Programme for Research and Innovation under the specific grant agreement No. 720270 (Human Brain Project SGA1), and No. 785907 (Human Brain Project SGA2).

References

  1. Albus J. S. (1971). A theory of cerebellar function. Math. Biosci. 10, 25–61. 10.1016/0025-5564(71)90051-4 [DOI] [Google Scholar]
  2. Albus J. S. (1972). Theoretical and experimental aspects of a cerebellar model. Dissert. Abstr. Int. 33. [Google Scholar]
  3. Annaswamy A. M., Narendra K. S. (1989). Adaptive control of simple time-varying systems, in Proceedings of the 28th IEEE Conference on Decision and Control (Tampa, FL: IEEE; ), 1014–1018. [Google Scholar]
  4. Antonietti A., Casellato C., D'Angelo E., Pedrocchi A. (2017). Model-driven analysis of eyeblink classical conditioning reveals the underlying structure of cerebellar plasticity and neuronal activity. IEEE Trans. Neural Netw. Learn. Syst. 28, 2748–2762. 10.1109/TNNLS.2016.2598190 [DOI] [PubMed] [Google Scholar]
  5. Atkeson C. G., An C. H., Hollerbach J. M. (1986). Estimation of inertial parameters of manipulator loads and links. Int. J. Robot. Res. 5, 101–119. [Google Scholar]
  6. Awtar S., Bernard C., Boklund N., Master A., Ueda D., Craig K. (2002). Mechatronic design of a ball-on-plate balancing system. Mechatronics 12, 217–228. 10.1016/S0957-4158(01)00062-9 [DOI] [Google Scholar]
  7. Barto A. G., Fagg A. H., Sitkoff N., Houk J. C. (1999). A cerebellar model of timing and prediction in the control of reaching. Neural Comput. 11, 565–594. [DOI] [PubMed] [Google Scholar]
  8. Buonomano D. V., Mauk M. D. (1994). Neural network model of the cerebellum: temporal discrimination and the timing of motor responses. Neural Comput. 6, 38–55. [Google Scholar]
  9. Caligiore D., Pezzulo G., Baldassarre G., Bostan A. C., Strick P. L., Doya K., et al. (2017). Consensus paper: towards a systems-level view of cerebellar function: the interplay between cerebellum, basal ganglia, and cortex. Cerebellum 16, 203–229. 10.1007/s12311-016-0763-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Casellato C., Antonietti A., Garrido J. A., Carrillo R. R., Luque N. R., Ros E., et al. (2014). Adaptive robotic control driven by a versatile spiking cerebellar network. PLoS ONE 9:e112265. 10.1371/journal.pone.0112265 [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Casellato C., Antonietti A., Garrido J. A., Ferrigno G., D'Angelo E., Pedrocchi A. (2015). Distributed cerebellar plasticity implements generalized multiple-scale memory components in real-robot sensorimotor tasks. Front. Comput. Neurosci. 9:24. 10.3389/fncom.2015.00024 [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Chapeau-Blondeau F., Chauvet G. (1991). A neural network model of the cerebellar cortex performing dynamic associations. Biol. Cybernet. 65, 267–279. [DOI] [PubMed] [Google Scholar]
  13. Chen C.-S. (2009). Sliding-mode-based fuzzy cmac controller design for a class of uncertain nonlinear system, in 2009 IEEE International Conference on Systems, Man and Cybernetics (San Antonio, TX: IEEE; ), 3030–3034. [Google Scholar]
  14. D'Angelo E., Antonietti A., Casali S., Casellato C., Garrido J. A., Luque N. R., et al. (2016). Modeling the cerebellar microcircuit: new strategies for a long-standing issue. Front. Cell. Neurosci. 10:176. 10.3389/fncel.2016.00176 [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Dean P., Porrill J., Ekerot C. F., Jörntell H. (2010). The cerebellar microcircuit as an adaptive filter: experimental and computational evidence. Nat. Rev. Neurosci. 11, 30–43. 10.1038/nrn2756 [DOI] [PubMed] [Google Scholar]
  16. Ebner T. J., Hewitt A. L., Popa L. S. (2011). What features of limb movements are encoded in the discharge of cerebellar neurons? Cerebellum 10, 683–693. 10.1007/s12311-010-0243-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Falotico E., Vannucci L., Ambrosano A., Albanese U., Ulbrich S., Vasquez Tieck J. C., et al. (2017). Connecting artificial brains to robots in a comprehensive simulation framework: the neurorobotics platform. Front. Neurorobot. 11:2. 10.3389/fnbot.2017.00002 [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Farrell J. A., Polycarpou M. M. (2006). Adaptive Approximation Based Control: Unifying Neural, Fuzzy and Traditional Adaptive Approximation Approaches, Vol 48 Hoboken, NJ: John Wiley & Sons. [Google Scholar]
  19. Francis B. A., Wonham W. M. (1976). The internal model principle of control theory. Automatica 12, 457–465. [Google Scholar]
  20. Fujiki S., Aoi S., Funato T., Tomita N., Senda K., Tsuchiya K. (2015). Adaptation mechanism of interlimb coordination in human split-belt treadmill walking through learning of foot contact timing: a robotics study. J. R. Soc. Interface 12:20150542. 10.1098/rsif.2015.0542 [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Fujita M. (1982). Simulation of adaptive modification of the vestibulo-ocular reflex with an adaptive filter model of the cerebellum. Biol. Cybernet. 45, 207–214. [DOI] [PubMed] [Google Scholar]
  22. Garrido Alcazar J. A., Luque N. R., D'Angelo E., Ros E. (2013). Distributed cerebellar plasticity implements adaptable gain control in a manipulation task: a closed-loop robotic simulation. Front. Neural Circuits 7:159 10.3389/fncir.2013.00159 [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Glanz F. H., Miller W. T., Kraft L. G. (1991). An overview of the cmac neural network, in [1991 Proceedings] IEEE Conference on Neural Networks for Ocean Engineering (Washington, DC: IEEE; ), 301–308. [Google Scholar]
  24. Guan J., Lin C.-M., Ji G.-L., Qian L.-W., Zheng Y.-M. (2018). Robust adaptive tracking control for manipulators based on a tsk fuzzy cerebellar model articulation controller. IEEE Access 6, 1670–1679. 10.1109/ACCESS.2017.2779940 [DOI] [Google Scholar]
  25. He W., Chen Y., Yin Z. (2016). Adaptive neural network control of an uncertain robot with full-state constraints. IEEE Trans. Cybernet. 46, 620–629. 10.1109/TCYB.2015.2411285 [DOI] [PubMed] [Google Scholar]
  26. He W., Huang B., Dong Y., Li Z., Su C. Y. (2018). Adaptive neural network control for robotic manipulators with unknown deadzone. IEEE Trans. Cybernet. 48, 2670–2682. 10.1109/TCYB.2017.2748418 [DOI] [PubMed] [Google Scholar]
  27. Houk J. C., Wise S. P. (1995). Distributed modular architectures linking basal ganglia, cerebellum, and cerebral cortex: their role in planning and controlling action. Cereb. Cortex 5, 95–110. [DOI] [PubMed] [Google Scholar]
  28. Ito M. (1984). The Cerebellum and Neural Control. New York, NY: Raven Press. [Google Scholar]
  29. Ito M. (1997). Cerebellar microcomplexes, in International Review of Neurobiology, Vol 41 (Cambridge, MA: Academic Press; Elsevier; ), 475–487. [DOI] [PubMed] [Google Scholar]
  30. Ito M. (2008). Control of mental activities by internal models in the cerebellum. Nat. Rev. Neurosci. 9:304. 10.1038/nrn2332 [DOI] [PubMed] [Google Scholar]
  31. Jiang Y., Yang C., Wang M., Wang N., Liu X. (2018). Bioinspired control design using cerebellar model articulation controller network for omnidirectional mobile robots. Adv. Mechan. Eng. 10:1687814018794349 10.1177/1687814018794349 [DOI] [Google Scholar]
  32. Kawato M. (1990). Feedback-error-learning neural network for supervised motor learning, in Advanced Neural Computers, ed Eckmiller R. (Amsterdam: Elsevier; North-Holland; ), 365–372. 10.1016/B978-0-444-88400-8.50047-9 [DOI] [Google Scholar]
  33. Kawato M. (1999). Internal models for motor control and trajectory planning. Curr. Opin. Neurobiol. 9, 718–727. [DOI] [PubMed] [Google Scholar]
  34. Kawato M., Furukawa K., Suzuki R. (1987). A hierarchical neural-network model for control and learning of voluntary movement. Biol. Cybernet. 57, 169–185. [DOI] [PubMed] [Google Scholar]
  35. Khalil W., Dombre E. (2002). Modeling, Identification and Control of Robots. London, UK: Butterworth-Heinemann. [Google Scholar]
  36. Kocijan J., Murray-Smith R., Rasmussen C. E., Girard A. (2004). Gaussian process model based predictive control, in Proceedings of the 2004 American Control Conference, Vol 3 (Boston, MA: IEEE; ), 2214–2219. [Google Scholar]
  37. Koenig N. P., Howard A. (2004). Design and use paradigms for gazebo, an open-source multi-robot simulator, in IROS, Vol 4 (Sendai: Citeseer; ), 2149–2154. [Google Scholar]
  38. Koziol L. F., Budding D., Andreasen N., D'Arrigo S., Bulgheroni S., Imamizu H., et al. (2014). Consensus paper: the cerebellum's role in movement and cognition. Cerebellum 13, 151–177. 10.1007/s12311-013-0511-x [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Levinson S. E., Silver A. F., Wendt L. A. (2010). Vision based balancing tasks for the icub platform: a case study for learning external dynamics, in Workshop on Open Source Robotics: iCub & Friends. IEEE Int'l. Conf. on Humanoid Robotics (Nashville, TN: ). [Google Scholar]
  40. Lin C.-M., Chen C.-H. (2007). Robust fault-tolerant control for a biped robot using a recurrent cerebellar model articulation controller. IEEE Trans. Syst. Man Cybernet. Part B 37, 110–123. 10.1109/TSMCB.2006.881905 [DOI] [PubMed] [Google Scholar]
  41. Ljung L. (2007). Identification of Nonlinear Systems. Linköping: Linköping University Electronic Press. [Google Scholar]
  42. Luo D., Hu F., Zhang T., Deng Y., Wu X. (2018). How does a robot develop its reaching ability like human infants do? IEEE Trans. Cognit. Dev. Syst. 10, 795–809. 10.1109/TCDS.2018.2861893 [DOI] [Google Scholar]
  43. Luque N. R., Garrido J. A., Carrillo R. R., D'Angelo E., Ros E. (2014). Fast convergence of learning requires plasticity between inferior olive and deep cerebellar nuclei in a manipulation task: a closed-loop robotic simulation. Front. Comput. Neurosci. 8:97. 10.3389/fncom.2014.00097 [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Luque N. R., Garrido J. A., Carrillo R. R., Olivier J.-M. C., Ros E. (2011). Cerebellarlike corrective model inference engine for manipulation tasks. IEEE Trans. Syst. Man. Cybernet. Part B 41, 1299–1312. 10.1109/TSMCB.2011.2138693 [DOI] [PubMed] [Google Scholar]
  45. Luque N. R., Garrido J. A., Naveros F., Carrillo R. R., D'Angelo E., Ros E. (2016). Distributed cerebellar motor learning: a spike-timing-dependent plasticity model. Front. Comput. Neurosci. 10:17. 10.3389/fncom.2016.00017 [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Marr D. (1969). A theory of cerebellar cortex. J. Physiol. 202, 437–470. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Mauk M. D., Donegan N. H. (1997). A model of pavlovian eyelid conditioning based on the synaptic organization of the cerebellum. Learn. Memory 4, 130–158. [DOI] [PubMed] [Google Scholar]
  48. Metta G., Sandini G., Konczak J. (1999). A developmental approach to visually-guided reaching in artificial systems. Neural Netw. 12, 1413–1427. [DOI] [PubMed] [Google Scholar]
  49. Miller W. (1987). Sensor-based control of robotic manipulators using a general learning algorithm. IEEE J. Robot. Automat. 3, 157–165. [Google Scholar]
  50. Miller W. T., Glanz F. H., Kraft L. G. (1990). Cmas: an associative neural network alternative to backpropagation. Proc. IEEE 78, 1561–1567. [Google Scholar]
  51. Miyamoto H., Kawato M., Setoyama T., Suzuki R. (1988). Feedback-error-learning neural network for trajectory control of a robotic manipulator. Neural Netw. 1, 251–265. [Google Scholar]
  52. Nakanishi J., Cory R., Mistry M., Peters J., Schaal S. (2008). Operational space control: a theoretical and empirical comparison. Int. J. Robot. Res. 27, 737–757. 10.1177/0278364908091463 [DOI] [Google Scholar]
  53. Nakanishi J., Farrell J. A., Schaal S. (2005). Composite adaptive control with locally weighted statistical learning. Neural Netw. 18, 71–90. 10.1016/j.neunet.2004.08.009 [DOI] [PubMed] [Google Scholar]
  54. Nakanishi J., Schaal S. (2004). Feedback error learning and nonlinear adaptive control. Neural Netw. 17, 1453–1465. 10.1016/j.neunet.2004.05.003 [DOI] [PubMed] [Google Scholar]
  55. Narendra K. S., Annaswamy A. M. (1987). Persistent excitation in adaptive systems. Int. J. Cont. 45, 127–160. [Google Scholar]
  56. Narendra K. S., Mukhopadhyay S. (1991a). Associative learning in random environments using neural networks. IEEE Trans. Neural Netw. 2, 20–31. [DOI] [PubMed] [Google Scholar]
  57. Narendra K. S., Mukhopadhyay S. (1991b). Multilevel control of a dynamical systems using neural networks, in [1991] Proceedings of the 30th IEEE Conference on Decision and Control (Brighton, UK: IEEE; ), 170–171. [Google Scholar]
  58. Narendra K. S., Mukhopadhyay S. (1997). Adaptive control using neural networks and approximate models. IEEE Trans. Neural Netw. 8, 475–485. [DOI] [PubMed] [Google Scholar]
  59. Narendra K. S., Parthasarathy K. (1991). Gradient methods for the optimization of dynamical systems containing neural networks. IEEE Trans. Neural Netw. 2, 252–262. [DOI] [PubMed] [Google Scholar]
  60. Nashed J. Y., Kurtzer I. L., Scott S. H. (2015). Context-dependent inhibition of unloaded muscles during the long-latency epoch. J. Neurophysiol. 113, 192–202. 10.1152/jn.00339.2014 [DOI] [PubMed] [Google Scholar]
  61. Nguyen-Tuong D., Peters J. (2011). Model learning for robot control: a survey. Cogn. Proc. 12, 319–340. 10.1007/s10339-011-0404-1 [DOI] [PubMed] [Google Scholar]
  62. Nguyen-Tuong D., Seeger M., Peters J. (2009). Model learning with local gaussian process regression. Adv. Robot. 23, 2015–2034. 10.1163/016918609X12529286896877 [DOI] [Google Scholar]
  63. Nowak D. A., Topka H., Timmann D., Boecker H., Hermsdörfer J. (2007). The role of the cerebellum for predictive control of grasping. Cerebellum 6:7. 10.1080/14734220600776379 [DOI] [PubMed] [Google Scholar]
  64. Ojeda I. B., Tolu S., Lund H. H. (2017). A scalable neuro-inspired robot controller integrating a machine learning algorithm and a spiking cerebellar-like network, in Biomimetic and Biohybrid Systems - 6th International Conference, Living Machines 2017, Stanford, CA, USA, July 26-28, 2017, Proceedings (Stanford, CA: ), 375–386. [Google Scholar]
  65. Patino H. D., Carelli R., Kuchen B. R. (2002). Neural networks for advanced control of robot manipulators. IEEE Trans. Neural Netw. 13, 343–354. 10.1109/72.991420 [DOI] [PubMed] [Google Scholar]
  66. Porrill J., Dean P. (2007). Recurrent cerebellar loops simplify adaptive control of redundant and nonlinear motor systems. Neural Comput. 19, 170–193. 10.1162/neco.2007.19.1.170 [DOI] [PubMed] [Google Scholar]
  67. Probst D., Maass W., Markram H., Gewaltig M.-O. (2012). Liquid computing in a simplified model of cortical layer iv: Learning to balance a ball, in Artificial Neural Networks and Machine Learning – ICANN 2012, eds Villa A. E. P., Duch W., Érdi P., Masulli F., Palm G. (Berlin; Heidelberg: Springer Berlin Heidelberg; ), 209–216. [Google Scholar]
  68. Quigley M., Conley K., Gerkey B., Faust J., Foote T., Leibs J., et al. (2009). Ros: an open-source robot operating system, in ICRA Workshop on Open Source Software, Vol. 3: (Kobe: IEEE; ), 5. [Google Scholar]
  69. Schmahmann J. D. (2004). Disorders of the cerebellum: ataxia, dysmetria of thought, and the cerebellar cognitive affective syndrome. J. Neuropsych. Clin. Neurosci. 16, 367–378. 10.1176/jnp.16.3.367 [DOI] [PubMed] [Google Scholar]
  70. Scott S. H., Cluff T., Lowrey C. R., Takei T. (2015). Feedback control during voluntary motor actions. Curr. Opin. Neurobiol. 33, 85 – 94. 10.1016/j.conb.2015.03.006 [DOI] [PubMed] [Google Scholar]
  71. Sontag E. D. (1992). Feedback stabilization using two-hidden-layer nets. IEEE Trans. Neural Netw. 3, 981–990. [DOI] [PubMed] [Google Scholar]
  72. Ting J. A., Mistry M., Peters J., Schaal S., Nakanishi J. (2006). A bayesian approach to nonlinear parameter identification for rigid body dynamics, in Robotics: Science and Systems, eds Sukhatme G. S., Schaal S., Burgard W., Fox D. (Philadelphia, PA; Cambridge, MA: MIT Press; ), 32–39. [Google Scholar]
  73. Tokuda I. T., Hoang H., Kawato M. (2017). New insights into olivo-cerebellar circuits for learning from a small training sample. Curr. Opin. Neurobiol. 46, 58 – 67. 10.1016/j.conb.2017.07.010 [DOI] [PubMed] [Google Scholar]
  74. Tolu S., Vanegas M., Garrido J. A., Luque N. R., Ros E. (2013). Adaptive and predictive control of a simulated robot arm. Int. J. Neural Syst. 23:1350010. 10.1142/S012906571350010X [DOI] [PubMed] [Google Scholar]
  75. Tolu S., Vanegas M., Luque N. R., Garrido J. A., Ros E. (2012). Bio-inspired adaptive feedback error learning architecture for motor control. Biol. Cybernet. 106, 507–522. 10.1007/s00422-012-0515-5 [DOI] [PubMed] [Google Scholar]
  76. Ugur E., Nagai Y., Sahin E., Oztop E. (2015). Staged development of robot skills: behavior formation, affordance learning and imitation with motionese. IEEE Trans. Autonom. Mental Dev. 7, 119–139. 10.1109/TAMD.2015.2426192 [DOI] [Google Scholar]
  77. Vannucci L., Tolu S., Falotico E., Dario P., Lund H. H., Laschi C. (2016). Adaptive gaze stabilization through cerebellar internal models in a humanoid robot, in Proceedings of the IEEE RAS and EMBS International Conference on Biomedical Robotics and Biomechatronics (Singapore: ), 25–30. [Google Scholar]
  78. Vernon D., Metta G., Sandini G. (2007). A survey of artificial cognitive systems: Implications for the autonomous development of mental capabilities in computational agents. IEEE Trans. Evol. Comput. 11, 151–180. 10.1109/TEVC.2006.890274 [DOI] [Google Scholar]
  79. Vijayakumar S., Schaal S. (2000). Locally weighted projection regression: an o (n) algorithm for incremental real time learning in high dimensional space, in Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000) (Stanford, CA: ), 288–293. [Google Scholar]
  80. Weng J. (2002). A theory for mentally developing robots, in Proceedings 2nd International Conference on Development and Learning. ICDL 2002 (Cambridge, MA: IEEE; ), 131–140. [Google Scholar]
  81. Weng J., Evans C., Hwang W. S., Lee Y.-B. (1999a). The developmental approach to artificial intelligence: Concepts, developmental algorithms and experimental results, in NSF Design and Manufacturing Grantees Conference, Long Beach, CA. [Google Scholar]
  82. Weng J., Hwang W.-S. (2006). From neural networks to the brain: Autonomous mental development. IEEE Comput. Intell. Mag. 1, 15–31. 10.1109/MCI.2006.1672985 [DOI] [Google Scholar]
  83. Weng J., Hwang W. S., Zhang Y., Evans C. (1999b). Developmental robots: theory, method and experimental results, in Processdings 2nd International Conference on Humanoid Robots (Tokyo: Citeseer; ), 57–64. [Google Scholar]
  84. Weng J., McClelland J., Pentland A., Sporns O., Stockman I., Sur M., Thelen E. (2000). Computational Autonomous Mental Development. Citeseer. [DOI] [PubMed] [Google Scholar]
  85. Wittenmark B. (1995). Adaptive dual control methods: an overview, in Adaptive Systems in Control and Signal Processing 1995, ed Bányász Cs. (Elsevier: ), 67–72. [Google Scholar]
  86. Wolpert D. M., Miall R. C., Kawato M. (1998). Internal models in the cerebellum. Trends Cogn. Sci. 2, 338–347. [DOI] [PubMed] [Google Scholar]
  87. Yamazaki T., Tanaka S. (2007). The cerebellum as a liquid state machine. Neural Netw. 20, 290–297. 10.1016/j.neunet.2007.04.004 [DOI] [PubMed] [Google Scholar]
  88. Zhang T., Ge S. S., Hang C. C. (2000). Adaptive neural network control for strict-feedback nonlinear systems using backstepping design. Automatica 36, 1835–1846. 10.1016/S0005-1098(00)00116-3 [DOI] [Google Scholar]

Articles from Frontiers in Neurorobotics are provided here courtesy of Frontiers Media SA

RESOURCES