Learning Task-Specific Dynamics to Improve Whole-Body Control
Andrej Gams1, Sean A. Mason2, Aleš Ude1, Stefan Schaal2,3 and Ludovic Righetti3,4

arXiv:1803.01978v2 [cs.RO] 8 Mar 2018

Abstract— In task-based inverse dynamics control, reference accelerations used to follow a desired plan can be broken down into feedforward and feedback trajectories. The feedback term accounts for tracking errors caused by inaccurate dynamic models or external disturbances. On underactuated, free-floating robots, such as humanoids, high feedback terms can be used to improve tracking accuracy; however, this can lead to very stiff behavior or poor tracking accuracy due to limited control bandwidth. In this paper, we show how to reduce the required contribution of the feedback controller by incorporating learned task-space reference accelerations. Thus, we i) improve the execution of the given specific task, and ii) offer the means to reduce feedback gains, providing for greater compliance of the system. With a systematic approach we also reduce the heuristic tuning of model parameters and feedback gains often present in real-world experiments. In contrast to learning task-specific joint torques, which might produce a similar effect but can lead to poor generalization, our approach directly learns the task-space dynamics of the center of mass of a humanoid robot. Simulated and real-world results on the lower part of the Sarcos Hermes humanoid robot demonstrate the applicability of the approach.

Fig. 1. Simulated and real-world lower part of the Sarcos Hermes humanoid robot used in the experiments.

I. INTRODUCTION

Unmodeled dynamics (i.e., friction, link flexibilities, unmodeled actuator dynamics or approximate model parameters) can have severe effects on tracking performance, which can be problematic for balancing legged robots. Indeed, models are often difficult to obtain and/or incorrect, and while parameter identification can improve their quality [1], it does not take into account unmodeled dynamic effects. The lack of accurate models has led to the use of combined feedforward and feedback control, which preserves stability and robustness to disturbances. The greater the error of the model, the greater the feedback gains necessary to ensure robust task achievement. This comes at the cost of a significant increase in stiffness and damping requirements.

Different methods of acquiring dynamic models [2] and exploiting them in control have been proposed [3]. Optimization-based approaches, such as hierarchical inverse dynamics [4]–[6], have gained popularity in recent years for the control of legged robots. However, these approaches rely on a dynamics model and often necessitate high task-space feedback gains to ensure good tracking performance on real robots, which do not have accurate dynamic models.

The need for accurate dynamic models of robots and tasks can be partially offset by iterative learning of the control signals, which relies on one of the main characteristics of robots: repeatability of actions given the same input signals. Iterative learning control (ILC) [7] has been extensively applied in robotics, including for learning task-specific joint control torques [8].

A. Problem Statement

In this paper we investigate task-specific learning of dynamics for improved task execution and compliance in the scope of optimization-based inverse dynamics control. Therefore, we want:
• to reduce the required contribution of the feedback term in the control,
• to consequently improve task-space tracking and compliance, and
• to act in the system's task space, typically the center of mass (CoM) dynamics.

Acting in task space, as opposed to joint space, potentially offers the means for application beyond the scope of the learned dynamics, i.e., through generalization.

The intended application of the proposed algorithm is improved control of dynamic tasks on humanoid robots. We performed our experiments on the lower part of the Sarcos Hermes humanoid robot, depicted in Fig. 1.

*This work was supported by the Slovenian Research Agency project BI-US/16-17-063.
1 Humanoid and Cognitive Robotics Lab, Dept. of Automatics, Biocybernetics and Robotics, Jožef Stefan Institute, Ljubljana, Slovenia. name.surname@ijs.si
2 Computational Learning and Motor Control Lab, University of Southern California, Los Angeles, California, USA.
3 Autonomous Motion Department, Max Planck Institute for Intelligent Systems, Tuebingen, Germany.
4 Tandon School of Engineering, New York University, New York, USA.
This paper is organized as follows. Section II provides a short literature overview. Section III gives a short recap of the QP-based inverse dynamics control. Section IV explains the proposed algorithm of applying learned task-space terms in quadratic-program-based inverse dynamics control. Section V depicts simulated and real-world results. A discussion is given in Section VI, followed by the conclusion and future prospects of this work.

II. RELATED WORK

Two bodies of work are relevant to this research: 1) learning and exploiting control torques for improved task execution, and 2) optimization-based control of floating-base systems.

Learning of control signals to iteratively improve task execution using the aforementioned ILC [7] has been extensively used in robotics [9]. Feedback-error learning, where the feedback motor command is used as an error signal to train a neural network which then generates a feedforward motor command, was proposed by Kawato [10]. The idea was extended to learning of contact interaction. In their work, Pastor et al. [11] combined learning of forces with dynamical systems to improve force tracking in interaction tasks. Similarly, in [12], dynamic movement primitives (DMPs) [13] were used in combination with learning of interaction forces. Learning to improve contact interaction was also applied directly to actuation torques. In [14], measured interaction forces were used to calculate the joint torques from the measured arm pose; these were applied in control in a feedforward manner. Similarly, joint torques along the kinematic trajectory were learned and encoded as DMPs in [15] and used to increase the accuracy of the next execution of the in-contact task. This approach to improving trajectory execution was also applied to full-sized humanoid robots, for example in [16], where a particular trajectory was run and the joint torques from that trial were used as the feedforward term on the next trial. These methods can go beyond mere repetition of the same task. In [8], the authors show how learned joint-torque signals, encoded in parametric form as compliant movement primitives (CMPs), can be generalized for new task parameters (weight, speed, etc.). However, because the approach is rooted in joint torques, the generalization is somewhat limited to variations of the same task.

Optimization-based inverse dynamics control has become a very popular method to control legged robots [4], [5], [17], [18], as it allows one to directly specify control objectives for multiple tasks, while ensuring priorities between tasks and the satisfaction of important constraints (actuation limits, friction cones, etc.). The redundancy of a complex robot such as a humanoid can therefore be optimally exploited. It is also possible to compute optimal feedback gains in a receding-horizon manner directly in task space by leveraging the task-reduced dynamics [4], [6], [19]. Such methods have remarkable capabilities, but are often limited by modeling errors, which necessitate significantly increased feedback gains and thus lead to very stiff behaviors. Some approaches, as far back as [20], have proposed to learn task-specific dynamic models that can then be used to synthesize control laws. Iterative methods to compute locally optimal control policies have, for example, recently been used with such learned models [21]. Iterative repetitions of the process are then used to collect additional data and re-learn a new policy [21], [22]. In these approaches, the learned models and resulting control policies operate on the whole robot. It is therefore not clear how other tasks and additional constraints can be incorporated without impairing the resulting behaviors. In this paper, we learn task-specific feedforward models which take into account the error in dynamic models during task execution, and combine them with a QP-based inverse dynamics controller. This allows us to significantly improve task tracking while creating more compliant behaviors.

III. CONTROL

In this section we briefly introduce our QP-based inverse dynamics, originally proposed in [4].

A. QP-Based Hierarchical Inverse Dynamics

The floating-base dynamics of a legged humanoid robot are given by

    M(q) q̈ + h(q, q̇) = S^T τ + J_c^T λ,    (1)

with q ∈ SE(3) × R^n the position and orientation of the robot in space together with its joint configuration, the mass-inertia matrix M ∈ R^{(n+6)×(n+6)}, the nonlinear terms h ∈ R^{n+6}, the actuation matrix S = [0_{n×6} I_{n×n}] and the end-effector contact Jacobian J_c ∈ R^{6m×(n+6)}, where n is the number of the robot's degrees of freedom, τ are the actuation torques and λ are the contact forces.

Given that a humanoid robot is a floating-base system, its dynamics can be decomposed into actuated and unactuated (floating-base) parts, respectively

    M_a(q) q̈ + h_a(q, q̇) = τ + J_{c,a}^T λ    (2)
    M_u(q) q̈ + h_u(q, q̇) = J_{c,u}^T λ    (3)

Equation (2) shows that the actuation torques τ are linearly determined by q̈ and λ. τ is therefore a redundant variable, the optimization needs to be performed only over q̈ and λ, and dynamic consistency reduces to enforcing Eq. (3). Kinematic contact constraints, ensuring that the parts of the robot in contact with the environment do not move,

    J_c q̈ + J̇_c q̇ = 0    (4)

are additional equality constraints of the optimization, while the foot center of pressure (CoP), friction forces, resultant normal torques, joint torques and joint accelerations are inequality constraints. The cost to be minimized is

    min_{q̈,λ}  ||ẍ − ẍ_des||²_{W_x} + ||P_null q̈ − q̈_des||²_{W_q} + ||λ − λ_des||²_{W_λ} + ||τ − τ_des||²_{W_τ}.    (5)

Here, W denotes the weighting matrices, P_null projects into the null space of the Cartesian task constraints, and x ∈ R^{6m} are the m stacked Cartesian end-effector poses.
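To make the structure of this optimization concrete, below is a minimal numpy sketch of an equality-constrained variant of the QP in (5). It keeps only the Cartesian task term and a contact-force regularizer, enforces the unactuated dynamics (3) and the contact constraint (4), and omits the inequality constraints (CoP, friction cones, torque and acceleration limits) as well as the null-space posture term. All function and variable names are our own illustrative assumptions; this is not the interface of the solver used in [4].

```python
import numpy as np

def solve_reduced_qp(M_u, h_u, Jc_u, Jc, dJc_dq, Jt, dJt_dq,
                     xdd_des, W_x, W_lam):
    """Sketch of the equality-constrained part of the QP in Eq. (5).

    Decision variable z = [qdd; lam], with qdd the generalized
    accelerations (dimension n+6) and lam the contact forces.
    Assumed shapes: M_u (6, n+6), h_u (6,), Jc_u (k, 6), Jc (k, n+6),
    dJc_dq (k,), Jt (p, n+6), dJt_dq (p,), xdd_des (p,), W_x (p, p),
    W_lam (k, k).
    """
    n_v = M_u.shape[1]            # n + 6 generalized velocities
    k = Jc.shape[0]               # number of contact force variables
    nz = n_v + k

    # Task cost ||Jt qdd + dJc_dq - xdd_des||^2_Wx, cf. Eqs. (5), (6)
    A = np.hstack([Jt, np.zeros((Jt.shape[0], k))])
    b = xdd_des - dJt_dq
    H = A.T @ W_x @ A
    g = A.T @ W_x @ b
    # Contact-force regularizer ||lam||^2_Wlam (lam_des = 0 assumed)
    H[n_v:, n_v:] += W_lam
    H += 1e-8 * np.eye(nz)        # numerical damping

    # Equality constraints: unactuated dynamics (3) and contacts (4)
    C = np.vstack([
        np.hstack([M_u, -Jc_u.T]),          # M_u qdd - Jc_u^T lam = -h_u
        np.hstack([Jc, np.zeros((k, k))]),  # Jc qdd = -dJc_dq
    ])
    d = np.concatenate([-h_u, -dJc_dq])

    # KKT system of the equality-constrained QP
    K = np.block([[H, C.T], [C, np.zeros((C.shape[0], C.shape[0]))]])
    sol = np.linalg.solve(K, np.concatenate([g, d]))
    return sol[:n_v], sol[n_v:nz]           # qdd, lam
```

Following (2), the actuation torques would then be recovered from the actuated rows of the dynamics, τ = M_a q̈ + h_a − J_{c,a}^T λ.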
Joint and Cartesian end-effector accelerations are related through

    ẍ = J_t q̈ + J̇_t q̇,    (6)

where J_t ∈ R^{6m_t×(n+6)} is the Jacobian for the m_t unconstrained end-effectors and J_c the Jacobian for the m_c constrained ones.

Desired end-effector accelerations are computed through

    ẍ_des = ẍ_ref + P_x (x_ref − x) + D_x (ẋ_ref − ẋ).    (7)

The matrices P_x and D_x represent the stiffness and damping gains of the PD controller, respectively. Desired joint accelerations are specified by

    q̈_des = P_q (q_ref − q) − D_q q̇.

For more details on the solver, see [4].

IV. TASK-SPECIFIC DYNAMICS

With QP-based inverse dynamics control we can control the behavior of the robot. To improve the tracking, we introduce an additional feedforward term.

The results of using QP-based inverse dynamics control will depend on the accuracy of the model. This can be seen in (7), where ẍ_ref constitutes the feedforward part and P_x (x_ref − x) + D_x (ẋ_ref − ẋ) the feedback part. If the model were perfect, the contribution of the feedback part would amount to 0. However, it is not, and the feedback part accounts for the discrepancy. We propose recording the feedback contribution and adding it in the next repetition of the exact same task (desired motion). Thus, we get

    ẍ_des,i = ẍ_ref + ẍ_fb,i−1 + P_x (x_ref − x) + D_x (ẋ_ref − ẋ),    (8)

where ẍ_ref + ẍ_fb,i−1 is the updated feedforward term and

    ẍ_fb,i−1 = P_x (x_ref − x_{i−1}) + D_x (ẋ_ref − ẋ_{i−1}).    (9)

We only used the feedback from one previous iteration (i = 2). Unlike in [16] or [8], the feedforward part is added in the task space of the robot, and not in its joint space. It is also combined with QP-based optimization control.

Using the recorded (learned) feedback signal in the next iteration of the same task provides us with an improved feedforward signal, which is task-specific. However, we can also build up a database of such signals for different task variations. Thus, the proposed algorithm can significantly correct the discrepancy between the model and the real system. Furthermore, we can use the database to generate an appropriate signal for previously untested tasks and task variations using statistical generalization [8].

A. Encoding the Feedforward Signal

The recorded (learned) signal can be encoded in any form. The focus in this paper is on periodic tasks. We chose to encode the signal as a linear combination of radial basis functions (RBFs) appropriate for periodic tasks. Thus, the signal encoding is compact and the signal itself is inherently filtered. Most importantly, as discussed in [8], this representation allows for computationally light¹ generalization using Gaussian process regression (GPR) [23]. A linear combination of RBFs as a function of the phase φ is given by

    ẍ_fb,i−1(φ) = ( Σ_{j=1}^{L} w_j Γ_j(φ) / Σ_{j=1}^{L} Γ_j(φ) ) r,    (10)

where Γ_j denotes the basis functions, given by

    Γ_j(φ) = exp(h_j (cos(φ − c_j) − 1)),    (11)

w_j is the weight of the j-th basis function, L is the number of basis functions, r is the amplitude gain, c_j are the centers of the basis functions and h_j > 0 their widths. The periodic phase φ is determined by the phase oscillator

    φ̇ = Ω,    (12)

where Ω is the frequency of oscillations.

B. Generalization

In the manner of generalizing joint-space feedforward torques [8], we can also generalize the learned task-space CoM accelerations. Having chosen the RBF encoding, we can generalize between the weights, for example using GPR. The goal of generalization is to provide us with a function

    F_Db : κ ↦ w    (13)

that provides the output in the form of a vector of RBF weights w, given the database of trained feedforward terms Db and the input, i.e., the query κ. We refer the reader to [8], [23] for details on GPR. While this kind of generalization is analogous to the one in [8], only in task space, potential generalization across tasks might offer more possibilities.

V. EXPERIMENTAL RESULTS

Experiments were performed on the lower part of the hydraulically actuated, torque-controlled Sarcos humanoid robot. It has in total 17 degrees of freedom (DoFs), with 7 in each leg and 3 in the torso. The system is depicted in Fig. 1. We used the SL simulation and control environment [24] for evaluation in simulation.

To demonstrate the applicability of the approach, we chose the task of periodic squatting with a predefined frequency. We compared the position of the CoM without and with the added feedforward term.

¹The calculation of the hyperparameters for GPR is computationally expensive, but it is performed offline. Only simple matrix multiplication is performed online.
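As an illustration of Section IV-A, the following sketch fits the RBF weights of (10), (11) to a recorded feedback signal and evaluates the resulting feedforward term for the next iteration, cf. (8), (9). The recorded signal here is a synthetic stand-in, and the control rate, basis widths and regularization are guessed values, not those used on the robot.

```python
import numpy as np

def rbf_features(phi, L, h=2.5):
    """Periodic basis functions of Eq. (11), normalized as in
    Eq. (10). The common width h > 0 is a guessed value."""
    c = np.linspace(0.0, 2 * np.pi, L, endpoint=False)   # centers c_j
    G = np.exp(h * (np.cos(np.asarray(phi)[:, None] - c) - 1.0))
    return G / G.sum(axis=1, keepdims=True)

def encode_feedback(phi, xdd_fb, L=25, reg=1e-6):
    """Fit weights w so that Psi(phi) @ w approximates the recorded
    feedback accelerations (amplitude gain r = 1 assumed)."""
    Psi = rbf_features(phi, L)
    # Regularized least squares: (Psi^T Psi + reg I) w = Psi^T xdd_fb
    return np.linalg.solve(Psi.T @ Psi + reg * np.eye(L), Psi.T @ xdd_fb)

# --- Illustrative usage on synthetic data, not robot data ---
Omega = 2 * np.pi * 0.25                # 0.25 Hz squatting, Eq. (12)
t = np.arange(0.0, 8.0, 0.004)          # assumed 250 Hz control loop
phi = (Omega * t) % (2 * np.pi)         # integrated phase oscillator

# Stand-in for the recorded feedback term of Eq. (9)
xdd_fb = 0.3 * np.sin(phi) + 0.05 * np.cos(2 * phi)

w = encode_feedback(phi, xdd_fb, L=25)
xdd_ff = rbf_features(phi, 25) @ w      # feedforward term for Eq. (8)
print("encoding error:", np.abs(xdd_ff - xdd_fb).max())
```

On the actual system, the feedback term of (9) would be logged during a trial and the fitted signal added to ẍ_des in the next repetition, as in (8).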
A. Simulation

In the first experiment, we performed squatting at a frequency of 0.25 Hz. Squatting was defined as a sinusoidal motion of the center of mass up and down over 6 cm; this range is close to the maximal motion the robot can perform without hitting joint limits.

In simulation, the model used to compute the inverse dynamics control is perfect; however, we set the task-space gains of the optimization such that it did not reach perfect tracking. Fig. 6 shows a sequence of pictures illustrating the squatting task.

Fig. 2 shows in the top plot the desired and actual vertical motion of the CoM for three cases: without the added term (i − 1), with the added term (i), and with the added term but with reduced gains P' = 0.2P (i'). The improvement of tracking is clearly visible in the top plot for both cases using the added feedforward term. The tracking error is depicted in the lower plot. Here we can see that the addition of the feedforward term reduces the tracking error by an order of magnitude, even with reduced P gains. Before the start of the squatting (depicted by the black dashed line), the learned feedforward term was not added to the system. With low gains (i'), this clearly results in a higher error of CoM_z tracking, but in higher compliance as well. The higher compliance is directly related to the error of tracking before the feedforward term is added, because the error of the model can be interpreted as an external force.

Fig. 2. Center of mass position during a squatting experiment without (i − 1) and with (i) the learned feedforward term, and with the feedforward term but with lower gains (i'). The top plot shows the CoM position in the vertical z axis for all experiments in the world coordinate frame. The black dashed line depicts the start of the experiment, the green dashed line shows the desired motion, the blue line shows the results without the added feedforward term (i − 1), the red line shows the results with the added feedforward term (i), and the ocher line shows the results with the added feedforward term but with 5 times reduced P feedback gains (i'). The bottom plot shows the error of CoM_z tracking for all three cases.

The plot in Fig. 3 shows the amplitude of the feedback part of the controller, again without (i − 1), with (i), and with the additional feedforward term but with reduced feedback gains (i'). We can again see the difference in the needed feedback correction. The plot also shows the encoded feedforward signal (purple), which very closely matches the original feedback signal. The matching could be improved, for example, with a higher number of basis functions. In the experiments we used L = 25 basis functions; the number was chosen empirically.

Fig. 3. The contribution of the feedback term in QP-based inverse dynamics control when squatting as defined in the experiment. Without an additional feedforward term in blue (i − 1), with the additional term in red (i), with the term but with reduced gains in ocher (i'), and the encoded feedforward signal in purple.

B. Generalization

We performed a basic generalization experiment over a variation of the task, to show that the approach can be used with generated feedforward signals. We used GPR to generalize the feedforward term for a squatting amplitude of κ = 5 cm. The database consisted of feedforward terms for different squatting amplitudes, i.e., task parameters κ = 2, 4, 6, 8 cm.² Fig. 4 shows the results, where we can see that the generalized feedforward term allowed for very similar, low CoM_z tracking errors as the recorded feedforward term for κ = 5 cm.

Fig. 4. Center of mass position during a squatting experiment without (i − 1) and with (i) the learned feedforward term, and with the feedforward term generated from the database (i''). The top plot shows the CoM position in the vertical z axis for all three experiments in the world coordinate frame. The black dashed line depicts the start of the experiment, the green dashed line shows the desired motion, the blue line shows the results for i − 1, the red line shows the results for i, and the ocher line shows the results for the generalized feedforward term i''. The bottom plot shows the error of CoM_z tracking for all three cases.

²The database is too small. Typically it would consist of at least 10 entries, but in this toy example they would be too close together.

C. Real-World Results

We performed the same experiment on the real robot. To increase the dynamics of the task, we tested the approach at two different frequencies: 0.25 Hz and 0.5 Hz. The error of CoM_z tracking for both cases is depicted in Fig. 5. We can see in the plots that the error is again significantly reduced for both cases. Fig. 6 shows a series of still photos depicting the real system performing one squat.

Fig. 5. Real-world error of CoM_z tracking with and without the added term for the squatting experiment at two different squatting frequencies, 0.25 Hz in the top and 0.5 Hz in the bottom plot. In both plots the results without the added term are in blue, and with the added feedforward term in red.
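Before turning to the discussion, the following sketch illustrates the generalization step of Section IV-B as used in the experiment above: GPR, here via scikit-learn as a stand-in for the implementation of [8], [23], maps the task parameter κ to a vector of RBF weights, cf. (13). The database entries below are random placeholders, not the actual trained feedforward terms.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF as RBFKernel, WhiteKernel

# Hypothetical database Db of Eq. (13): squatting amplitudes kappa
# (in cm) paired with the RBF weight vector learned for each one.
rng = np.random.default_rng(0)
kappas = np.array([[2.0], [4.0], [6.0], [8.0]])   # task parameters
W_db = rng.normal(size=(4, 25))                   # one w per task, L = 25

# GPR mapping F_Db : kappa -> w; the kernel choice is our assumption
gpr = GaussianProcessRegressor(
    kernel=RBFKernel(length_scale=2.0) + WhiteKernel(1e-6),
    normalize_y=True,
)
gpr.fit(kappas, W_db)        # hyperparameter optimization, done offline

# Online query for an untrained amplitude, e.g. kappa = 5 cm
w_query = gpr.predict(np.array([[5.0]]))[0]       # generalized weights
print(w_query.shape)                              # (25,)
```

Consistent with footnote 1, the expensive hyperparameter fit happens offline in fit(), while an online query amounts to a matrix multiplication.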
VI. DISCUSSION

With regard to our problem statement, we can say that the proposed approach thoroughly resolves it. The results show a reduced contribution of the feedback term; a clear improvement of the tracking performance, along with the possibility to reduce the feedback gains and thus increase the compliance; and the approach is applied in the task space of the system. In the following we briefly discuss these points.

We first discuss the improvement of the tracking performance and the reduction of the feedback-term contribution. The question whether the system behavior is the same with the additional task-specific feedforward term and low feedback gains, or without the additional task-specific feedforward term and high gains, has previously been studied by Goldsmith [25]. The author shows that an equivalent feedback controller can always be constructed from the ILC parameters with no additional plant knowledge, whether or not the ILC includes current-cycle feedback. Note that in (8), current-cycle feedback is present. If the ILC converges to zero error, the equivalent feedback is a high-gain controller. Herein lies the main advantage of using ILC: low feedback gains can be used, resulting in increased compliance. Compliance of the system has been recognized as one of the key elements for real-world deployment of robots in unstructured environments, as it provides robustness to unplanned disturbances [17].

Our approach reduces the contribution of the feedback term in a manner similar to an improved dynamic model. As shown in the literature (e.g., in [1]), a dynamic model never completely describes the behavior of a complex system. On a real system, such as the lower part of the Hermes bipedal robot used in this work, this difference is easily explained by the influence of the hydraulic hoses, which were not modeled. The difference of our approach from improving the actual model is that our approach is task-specific. The learned feedforward terms for squatting are by default only applicable to squatting. As already outlined in Section IV, building up a database is a rather straightforward solution. This has not only been applied to the model (for lack of a better word), but also to control policies resulting from optimization. In [26], a database of such control policies is used to warm-start the optimization. A more advanced solution than just building up a database is to use the database to generate feedforward terms for previously untrained situations. Different methods can be applied, for example statistical learning, such as GPR, which was used in a similar manner for joint torques in [8]. A broad-enough database could potentially be used with deep neural networks, with the output being the weights of the CMP. A similar approach was applied to kinematic trajectories, which were generated from raw images in [27]. The database would have to be rather substantial, though.

The real advantage of learning and applying the feedforward terms in task space is the similarity it offers across different tasks. For example, during walking, the CoM position moves from one side to the other, which is (from the CoM point of view) the same as simply shifting the weight from one side to the other. The advantage of the proposed approach is quite obvious, as feedforward terms for tasks that are easier to successfully implement can be used in a feedforward manner for more complex tasks. In the given case, feedforward terms for shifting the weight from one side to the other can be used for walking. However, this cross-task generalization remains an open research question.

VII. CONCLUSION & FUTURE WORK

In this paper we showed that basic iterative learning can be applied to task-space accelerations in order to improve the task execution of a complex, free-floating robot system controlled with QP-based inverse dynamics control. Our approach has fully resolved the specified problem statement. Furthermore, it has the potential to substantially improve the behavior of model-based control methods with the application of learned signals across different tasks. The latter, however, remains a task for future work.

REFERENCES

[1] M. Mistry, S. Schaal, and K. Yamane, "Inertial parameter estimation of floating base humanoid systems using partial force sensing," in 2009 9th IEEE-RAS International Conference on Humanoid Robots, pp. 492–497, Dec 2009.
Fig. 6. Still images of the lower part of the Sarcos Hermes robot performing one squat in simulation (top) and on the real robot (bottom). See also the accompanying video for the real-world experiment.

[2] D. Nguyen-Tuong and J. Peters, "Model learning for robot control: a survey," Cognitive Processing, vol. 12, pp. 319–340, Nov 2011.
[3] D. W. Franklin and D. M. Wolpert, "Computational mechanisms of sensorimotor control," Neuron, vol. 72, no. 3, pp. 425–442, 2011.
[4] A. Herzog, N. Rotella, S. Mason, F. Grimminger, S. Schaal, and L. Righetti, "Momentum control with hierarchical inverse dynamics on a torque-controlled humanoid," Autonomous Robots, vol. 40, pp. 473–491, Mar 2016.
[5] A. Escande, N. Mansard, and P.-B. Wieber, "Hierarchical quadratic programming: Fast online humanoid-robot motion generation," The International Journal of Robotics Research, vol. 33, pp. 1006–1028, June 2014.
[6] S. Kuindersma, F. Permenter, and R. Tedrake, "An efficiently solvable quadratic program for stabilizing dynamic locomotion," in 2014 IEEE International Conference on Robotics and Automation, pp. 2589–2594, IEEE, 2014.
[7] D. A. Bristow, M. Tharayil, and A. G. Alleyne, "A survey of iterative learning control," IEEE Control Systems, vol. 26, pp. 96–114, June 2006.
[8] M. Deniša, A. Gams, A. Ude, and T. Petrič, "Learning compliant movement primitives through demonstration and statistical generalization," IEEE/ASME Transactions on Mechatronics, vol. 21, pp. 2581–2594, Oct 2016.
[9] M. Norrlöf, Iterative Learning Control – Analysis, Design, and Experiments. PhD thesis, Linköpings Universitet, 2000.
[10] M. Kawato, "Feedback-error-learning neural network for supervised motor learning," in Advanced Neural Computers (R. Eckmiller, ed.), pp. 365–372, Amsterdam: North-Holland, 1990.
[11] P. Pastor, M. Kalakrishnan, L. Righetti, and S. Schaal, "Towards associative skill memories," in 2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids), pp. 309–315, 2012.
[12] A. Gams, B. Nemec, A. J. Ijspeert, and A. Ude, "Coupling movement primitives: Interaction with the environment and bimanual tasks," IEEE Transactions on Robotics, vol. 30, pp. 816–830, Aug 2014.
[13] A. J. Ijspeert, J. Nakanishi, H. Hoffmann, P. Pastor, and S. Schaal, "Dynamical movement primitives: Learning attractor models for motor behaviors," Neural Computation, vol. 25, pp. 328–373, 2013.
[14] R. Calandra, S. Ivaldi, M. Deisenroth, and J. Peters, "Learning torque control in presence of contacts using tactile sensing from robot skin," in IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids), pp. 690–695, Nov 2015.
[15] F. Steinmetz, A. Montebelli, and V. Kyrki, "Simultaneous kinesthetic teaching of positional and force requirements for sequential in-contact tasks," in IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids), pp. 202–209, Nov 2015.
[16] N. S. Pollard, J. K. Hodgins, M. J. Riley, and C. G. Atkeson, "Adapting human motion for the control of a humanoid robot," in Proceedings of the 2002 IEEE International Conference on Robotics and Automation, vol. 2, pp. 1390–1397, 2002.
[17] B. J. Stephens and C. G. Atkeson, "Dynamic balance force control for compliant humanoid robots," in 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1248–1255, Oct 2010.
[18] M. DeDonato, V. Dimitrov, R. Du, R. Giovacchini, K. Knoedler, X. Long, F. Polido, M. A. Gennert, T. Padır, S. Feng, H. Moriguchi, E. Whitman, X. Xinjilefu, and C. G. Atkeson, "Human-in-the-loop control of a humanoid robot for disaster response: A report from the DARPA Robotics Challenge Trials," Journal of Field Robotics, vol. 32, no. 2, pp. 275–292, 2015.
[19] A. Herzog, N. Rotella, S. Schaal, and L. Righetti, "Trajectory generation for multi-contact momentum control," in IEEE-RAS International Conference on Humanoid Robots (Humanoids), pp. 874–880, 2015.
[20] S. Schaal and C. Atkeson, "Robot juggling: implementation of memory-based learning," IEEE Control Systems, vol. 14, pp. 57–71, 1994.
[21] S. Levine and P. Abbeel, "Learning neural network policies with guided policy search under unknown dynamics," in Advances in Neural Information Processing Systems 27 (Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, eds.), pp. 1071–1079, Curran Associates, Inc., 2014.
[22] M. P. Deisenroth and C. E. Rasmussen, "PILCO: A model-based and data-efficient approach to policy search," in Proceedings of the 28th International Conference on Machine Learning, ICML'11, pp. 465–472, Omnipress, 2011.
[23] C. Rasmussen and C. Williams, Gaussian Processes for Machine Learning. Cambridge, MA, USA: MIT Press, 2006.
[24] S. Schaal, "The SL simulation and real-time control software package," tech. rep., Los Angeles, CA, 2009.
[25] P. B. Goldsmith, "On the equivalence of causal LTI iterative learning control and feedback control," Automatica, vol. 38, no. 4, pp. 703–708, 2002.
[26] N. Mansard, A. Del Prete, M. Geisert, S. Tonneau, and O. Stasse, "Using a memory of motion to efficiently warm-start a nonlinear predictive controller," in 2018 IEEE International Conference on Robotics and Automation, to appear, 2018.
[27] R. Pahič, A. Gams, A. Ude, and J. Morimoto, "Deep encoder-decoder networks for mapping raw images to movement primitives," in 2018 IEEE International Conference on Robotics and Automation, to appear, 2018.