Decoupled Neural Network Reference Compensation Technique for a PD Controlled Two Degrees-of-Freedom Inverted Pendulum
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
92 International Journal of Control, Automation, and Systems Vol. 2, No. 1, March 2004 Decoupled Neural Network Reference Compensation Technique for a PD Controlled Two Degrees-of-Freedom Inverted Pendulum Seul Jung and Hyun Taek Cho Abstract: In this paper, the decoupled neural network reference compensation technique (DRCT) is applied to the control of a two degrees-of-freedom inverted pendulum mounted on an x-y table. Neural networks are used as auxiliary controllers for both the x axis and y axis of the PD controlled inverted pendulum. The DRCT method known to compensate for uncertainties at the trajectory level is used to control both the angle of a pendulum and the position of a cart si- multaneously. Implementation of an on-line neural network learning algorithm has been imple- mented on the DSP board of the dSpace DSP system. Experimental studies have shown success- ful balancing of a pendulum on an x-y plane and good position control under external distur- bances as well. Keywords: Neural network, 2 DOF inverted pendulum, on-line learning, DSP system. 1. INTRODUCTION tries to move toward one direction to minimize a po- sitional error while a controller for the pendulum tries Inverted pendulum application using various con- to move in the opposite direction to minimize the trol methods has been a typical example for advanced angle error. When this contradiction occurs, it is dif- control education, as well as interesting research. ficult to decide the suitable control law. This is one Control of an inverted pendulum has been considered reason that the conventional fixed PD controller can- as a fascinating, but difficult problem to solve since not control both the angle of the pendulum and the the system has very challenging characteristics such position of the cart concurrently. To tackle this prob- as nonlinearity and a single-input multi-output struc- lem, suitable controllers’ gains for various cases ture [1, 2]. Many successful results using the ad- should be considered. vanced control theories for balancing the inverted Fuzzy algorithm is a good candidate for solving pendulum using a cart have been reported throughout this type of problem, but the assessment of fuzzy various literatures [1, 3]. Those successful results rules for fuzzification does not simply satisfy both have been mainly focused on balancing the pendulum, objectives [7-9]. Several trial and error experiments rather than on controlling the position of the cart. are required to obtain a certain satisfaction. Visual Recently, successful control of both the angle and the feedback control for the inverted pendulum has also position of the inverted pendulum has been demon- been proposed [9, 10]. Rather than solely depending strated by practical experiments [4-6]. upon encoder signals, controlling the balance of the The nature of controlling both the angle of the pendulum relies on visual feedback. The performance pendulum and the position of the cart with a single of visual feedback control is dependent upon the ac- input force has remained as an open nonlinear prob- curacy of the vision system. lem to be overcome. The difficulty of controlling Neural network based control is another good can- both the angle and the position of the inverted pendu- didate for this application. In our previous researches, lum system comes from different dynamic movement control of both the angle and the position of the in- patterns of the pendulum and the cart. For example, verted pendulum system has been successfully per- let us consider the case that a controller for the cart formed on a large x-y table robot [4, 6]. However, __________ due to the large size of our previous system, position Manuscript received February 20, 2003; revised October 16, control and balancing of the inverted pendulum of 2003; accepted December 15, 2003. Recommended by Edito- both axes on the x-y plane failed. One of the reasons rial Board member Jin Young Choi under the direction of Edi- for the failure was that the actuated motor could not tor Jin Bae Park. This research has been supported by the Ko- rea Research Foundation with the contract of KRF 2002-002- generate enough torque for rapid movement of the D000765. The authors would like to thank them for their kind axis due to the heavy weight of one side of the two support. axes. Seul Jung and Hyun Taek Cho are with the Department of In this paper, as an extension of our previous re- Mechatronics Engineering, Chungnam National University, search [4, 6], control of the inverted pendulum on the 220 Gung-dong, Yuseong-gu, Daejeon 306-764, Korea (e- x-y table is revisited. Like the balancing of a stick on mail: jungs@cnu.ac.kr, hyuntkc@empal.com).
International Journal of Control, Automation, and Systems Vol. 2, No. 1, March 2004 93 a palm, a pendulum can freely move on the x-y plane. Connect Panel An x-y table is newly built and its size is relatively ds1003 reduced to generate faster movements. Neural net- Data works are also used as auxiliary controllers to help PWM the PD controller for the system to minimize the er- dSPACE dSPACE rors of angles and positions of each axis. Differing ControlDesk Computer Expansion Box Motor from the previous research, a decoupled neural net- Driver Motor Power work structure is used. Decoupled neural networks X-Y Belt Table Motor Power structure means that two separate neural networks are used for controlling each axis of the x-y table instead of using a single neural network. Since the structure Z of the x-y table is a more likely decoupled system, X the use of two separate neural networks is suitable for eliminating any coupling effects. For experiments, the newly designed smaller x-y Y table with the dSpace DSP system is implemented. Fig. 1. The overall system. Interface between the robot and the DSP system has been implemented to drive the motors of each axis. On-line neural computation algorithm is developed and implemented on the DSP board of the dSpace system to achieve real time control. Successful re- sults of maintaining balance of the pendulum and position control of the cart on the x-y plane have been obtained by the proposed control algorithm. v mg h 2. SYSTEM STRUCTURE Ux 2.1. Overall system structure The overall system structure is shown in Fig. 1. The system consists of three parts: a controller, a x pendulum on an x-y table, and actuators. The control component includes a computer with the dSpace DSP Fig. 2. Inverted pendulum model. board and interface. The DSP board is used to calcu- late the neural network learning algorithm in an on- and horizontal forces are given as follows: line fashion. The main body includes an inverted x + Lθx cos θ x − Lθx2 sin θ x ) , h = m ( pendulum and an x-y table. The size of the x-y table (2) is 0.8m × 0.9m. As shown in Fig. 1, a 2-DOF in- v − mg = − mLx sin θ x − mLθx2 cos θ x , verted pendulum is mounted on the x-axis of the ro- bot. The x-axis moves along the y-axis with an LM where m is the mass of a pendulum and x is the Guide. Two axes are actuated by two DC motors displacement in the x axis. through timing belts. Belt tension is often considered For a cart, as an uncertainty. The neural network control algorithm is imple- x = ux − h , M x (3) mented on the DSP board and the board generates PWM signals to the motor drivers. where M x is the mass of a cart in the x axis. Solving for θx and x by combining (1), (2) and 2.2. Inverted pendulum (3) yields the dynamic model of an inverted pendu- The dynamics of an inverted pendulum can be lum shown as follows: modeled in x and y directions separately. The dy- namic equation for the x axis is x + mL cos θ xθx − mL sin θ xθx2 + b x x = u x , ( M x + m ) Jθx = v sin θ x − hL cosθ x + bθ x θ x , (1) (4) where J is an inertia of a pendulum, θ x is the ( J + mL2 )θx + mL cos θ x x − mgL sin θ x − bθ x θx = 0 . angle of x axis, L is the length of a pendulum, bθ x is (5) friction constant, and v and h are vertical force and horizontal force vectors, respectively. Vertical We know from equations (4) and (5) that the
94 International Journal of Control, Automation, and Systems Vol. 2, No. 1, March 2004 a two degrees-of-freedom pendulum, the combined two separate control inputs are required. The control input u x for an x axis is formed by add- ing two controllers’ outputs from an angle error and a displacement error. The errors are defined as follows: eθ = θ d − θ , ex = xd − x , (10) where eθ is the angle error and ex is the positional error. The PD controller is now formed as _ _ _ u x = u θ x + u px , (11) Fig. 3. PD controlled pendulum system. _ system has a single input u x and two outputs x u θ x = k pθ x ⋅ eθ + k d θ x ⋅ eθ , and θ for an x axis. _ (12) Combining (4) and (5) yields u px = k ppx ⋅ e x + k dpx ⋅ e x , M ′( mgLsθ x + bθ x θx ) − mLcθ x ( mLsθ xθx2 − bx x + u x ) _ _ θx = where u θ x and u px are nominal control inputs for M ′J ′ − m 2 L2 c 2θ x (6) a pendulum and for a cart of the x axis, respectively. _ _ J ′( mLsθ xθx2 − bx x + u x ) − mLcθ x ( mLgsθ x + bθ x θx ) The control input u θ y and u p y for y axis can x= be represented in a similar way. The difficulty of con- M ′J ′ − m 2 L2 c 2θ x trolling the pendulum angle and the cart position si- (7) multaneously comes from different configurations where _ _ u θ and u p . M ′ = M x + m, J ′ = J + mL2 , sθ = sin θ , cθ = cosθ . The use of diagonal controller gains can decouple For the y axis, we have a similar dynamic equation as two axes, but there still are coupled effects such as M y ( mgLsθ y + bθ y θy ) − mLcθ y ( mLsθ yθy2 − b y x + u y ) the coriolis force and other unknown nonlinear terms. θy = As a result the linear controller cannot cope with sys- M y J ′ − m 2 L2 c 2θ y tem parameter variations well enough, resulting in (8) the failure of position control of the cart [6]. Exten- sive simulation studies have been done. J ′( mLsθ yθy2 − b y x + u y ) − mLcθ y ( mLgsθ y + bθ y θy ) In order to improve the control performance, two y= neural networks are used. Since the neural network is M y J ′ − m 2 L2 c 2θ y nonlinear it is a good candidate for nonlinear system (9) control. A decoupled neural network structure for Note that M y includes the mass of an x axis be- controlling each axis separately helps the system to be more decoupled. Fig. 5 indicates the decoupled cause the y axis carries the x axis at all times. neural network structure for a single axis. 3. PD CONTROL OF INVERTED PENDU- 4. REFERENCE COMPENSATION TECH- LUM NIQUE FOR NN CONTROL In this section, the non-model based PD control is In this section, one of the on-line learning algo- presented. It is known that a PD control can stabilize rithms for neural network control is presented. The the second order system. Even though we have de- algorithm called the reference compensation tech- rived the dynamic equations in the previous section, nique has been proposed and it has shown good per- it is very difficult to obtain the exact dynamic equa- formances in the robot position control [11, 12]. tions including uncertain nonlinear terms. Therefore, This scheme is identical to the feedback error ignoring the system dynamics, the most uncompli- learning method in that it performs inverse dynamic cated method is to use the simple linear PD control- control, but it is also different in that compensation is lers as the main controllers. done without modifying pre-fixed linear controllers. Fig. 3 depicts the PD control structure of controlling This control scheme is depicted in Fig. 4. The basic the inverted pendulum in one axis. In order to control
International Journal of Control, Automation, and Systems Vol. 2, No. 1, March 2004 95 NN Compensator PD Controller Pendulum System qr ε τ q qd Neural Network Controller Σ Controller Plant θd θd ν Φ θr uθ f θ θ (t ) Σ Σ K Pθ Σ Σ Cart θ θr & P θ (t − 1) F.D Σ Σ K Dθ Pole ε Σ Neural uP P Network Pr P (t ) Σ Σ K PP Σ Pr P (t − 1) F.D Σ Σ K DP Pd Pd Fig. 5. Neural network control block diagram for a sin- Fig. 4. Reference compensation technique scheme. gle axis. concept of this scheme is that the NN controller acts . as the inverse of the system under PD control so that Φ x = k pxφ px + kdx φ px the system response q tracks the desired response Also note that the compensating signal Φ x qd with minimal distortion. Neural networks are solves the sign problem between uθ x and u px . placed in front of the closed loop controlled system Since compensation is performed at the trajectory as pre-filters as seen in Fig. 4. Neural network out- level, those compensating signals are amplified puts are added to reference trajectories. Added terms through controller gains so that the magnitudes of are subtracted by output signals to generate error sig- those pre-filtered signals are small compared with nals ε . The errors are multiplied by controller gains. ones in other auxiliary type controllers [5]. In the Therefore, they eventually shape the reference in- paper [5], the neural network is used to adjust PID put trajectory qr in such a way that the output er- controller gains. The goal of minimizing error is the ror ε is minimized to zero. same with the proposed RCT algorithm, but the dif- Our proposed control block diagram is shown in ference comes from the controlling structure. The Fig. 5. The inverted pendulum system is controlled RCT algorithm is known to have the advantage of by PD control and neural network control. outer loop control without modifying the internal From Fig. 5, the PD controller with compensating control structure [11, 12]. signals from a neural network forms the following control inputs: 5. LEARNING ALGORITHM . uθ x = k pθ (eθ + φθ ) + kdθ (eθ + φ θ ) , (13) The neural network structure for a single axis is shown in Fig. 6. A two layered feed-forward structure . is used. Input patterns are the combination of position u px = k px (ex + φ p ) + kdx (ex + φ p ) . (14) and angle errors. For experiments, 9 hidden units are used. Selection of the number of hidden units is . . based on trial and error. Delayed states are used as The derivatives φ θ and φ p of neural network inputs of a neural network to give dynamics into the outputs φθ and φ p are obtained by the finite differ- neural network. Two neural network outputs are used, one for compensating an angle error and the other for ence method, respectively. a position error are used for a single axis. The total control input for one axis is the sum of Nonlinear function of hidden unit is tangent hyper- (13) and (14). bolic function. _ _ u x = uθ x + u px + Φ x . (15) 1 − exp(− x) f ( x) = . (17) 1 + exp(− x) The new control input is actually the addition of a neural network compensating signal to nominal con- From (4), (5), and (15), we have the following trol input u x . These compensating terms compen- closed loop error equation: sate for any dynamic uncertainties that are not mod- _ _ eled. u θ x + u p x = f (θ, θ , θ , x , x , x ) − Φ x . (18) Define the neural network output as Define the training signal v with nominal con- Φ x = Φθ x + Φ px , (16) trol inputs as . where Φθ x = k pθ φ θ x + kdθ φθ x and
96 International Journal of Control, Automation, and Systems Vol. 2, No. 1, March 2004 bj yields the gradient as Oj Oi 1 bk ∂E ∂E ∂v ∂v 1 = =v θ (t ) ∂w ∂v ∂w ∂w Ok (22) θ (t − 1) ∂Φ ∂Φ ∂Φ px φθ = −v = −v( θ x + ), θ (t − 2 ) ∂w ∂w ∂w p (t ) ∂Φθ x ∂φ ∂Φ px ∂φ px φp where = k pθ θ x and = k px . p (t − 1) ∂w ∂w ∂w ∂w p (t − 2 ) By using the gradient function in (22) the back- w jk propagation algorithm can be used. The weight wij change is formed as Fig. 6. The structure of neural network for a single ∂E ∆w(t ) = η v + α∆w(t − 1) . (23) axis. ∂w Weights are updated as w(t + 1) = w(t ) + ∆w(t ) , (24) where η is a learning rate and α is a momentum coefficient. Neural network weights are updated at every sam- pling time. Since rapid sampling time can be achieved with the help of DSP hardware technology, real time control of a neural network becomes possi- ble. Even on-line learning and control of the system can be performed. This means that no a priori learn- ing before control action is required. In the next section, experimental studies of neural network control are presented. 6. EXPERIMENTAL SETUPS The overall system structure is shown in Fig. 7. The whole system consists of three parts: an x-y table robot, the dSpace DSP system, and a PC. The DSP board is used to implement the neural network algo- Fig. 7. Real inverted pendulum system. rithm in real time, which requires a huge calculation time. Numerical values of neural network outputs _ _ calculated by the DSP board are converted to PWM v = uθ x + u px . (19) signals for motor drivers. The DC motors driven by its own drivers com- Then, at the convergence v = 0 , from equation manded from the DSP board mounted actually actu- (18) an ideal neural network output becomes the in- ate each axis of the x-y table. Encoders mounted on verse dynamics of the system. each axis sense rotations and those sensed data are sent to the encoder board. An encoder board collects Φ ≅ f (θ, θ, θ , x, x , x) . (20) data and converts them into digital values that can be So, ultimately, the inverse dynamic control can be used as a feedback in the control loop. Table 1 shows achieved. This is the difference in the control struc- the specifications of actuators. ture from those of other PID tuning methods [5]. Table 1. Specification of actuators. The objective function is defined to minimize the error v as Devices Specifications 1 2 DC motor 24V/70W E = v . (21) 2 Encoder 2000 counts/turn In order to use the steepest decent algorithm, the Gear ratio 5:1 gradient should be obtained. Differentiating (21)
International Journal of Control, Automation, and Systems Vol. 2, No. 1, March 2004 97 Movement of pendulum on x-y plane y axis pendulum angle error 0.08 0.025 0.07 0.02 0.06 0.05 0.015 Angle error of y axis (rad) 0.04 y axis (m) 0.01 0.03 0.02 0.005 0.01 0 0 -0.01 -0.005 -0.02 -0.12 -0.1 -0.08 -0.06 -0.04 -0.02 0 0.02 0.04 0.06 0.08 -0.01 x axis (m) 0 10 20 30 40 50 60 time (sec) Fig. 8. Movement of pendulum on x-y plane. Fig. 10. Y axis angle error. x axis pendulum angle error 0.02 control to neural networks after stabilizing the system by PD controllers. 0.015 For the neural network structure, 9 hidden units for 0.01 each neural network are used. Learning rate η x = Angle error of x axis (rad) 0.0011, η y = 0.001 and momentum α x = 0.15, α y = 0.005 0.05 are optimized. These constant values are opti- 0 mized by trial and error basis. We found that the learning rate is the most sensitive variable to per- -0.005 formance. Selecting a larger learning rate gives faster convergence of errors as well as occasional instabili- -0.01 ties. The overall sampling time is 1 KHz. -0.015 0 10 20 30 40 50 60 7. EXPERIMENTAL RESULTS time (sec) Fig. 9. X axis angle error. Initially the pendulum is located at (0, 0) on the x- y plane. The pendulum is well maintained at the ini- The weight of the pendulum is 0.35kg without a tial position until there is an external hit. Then the weight and 0.5Kg with a weight at the top. movement of the pendulum is affected by an external There are two encoders mounted on the cart to disturbance by the hand. The pendulum is required to measure the rotational angles of both axes. The in- move back to the initial position. verted pendulum can fall in any direction while a cart Fig. 8 shows the movement of the pendulum on the can move on the x-y plane. The movement of the cart x-y plane after following hits. is measured by encoders attached to the motor joint. We can see that there have been two disturbances The x-y table is actuated by DC motors through tim- by hitting the pendulum in diagonal directions. The ing belts. Since the x axis cart moves on the top of y pendulum has kept returning to the initial position axis, there are coupling effects, and this leads to the while maintaining the balance of the angle of the reason that control of the y axis is more difficult. pendulum shown in Fig. 9. Traveling distances are For the controller gains, k dθ x = 0.8, k pθ x = 6, from - 10.5 cm to 6 cm in the x axis and from -1.5 cm to 4 cm in the y axis. The pendulum angle errors for k dpx = -0.6, k ppx = -0.5 for x axis and k dθ y = 1.4, both axes are shown in Figs. 9 and 10 for each direc- tion. Large overshoots by disturbances are observed k pθ y = 8.5, kdpy = -0.95, k ppy = -0.8 for y axis are at approximately both 13 seconds and 29 seconds and used. The PD gains are optimized by trial errors basis. they settled down quickly. However, gain values are small enough to maintain The positional errors are shown in Figs. 11 and 12. stability so that neural networks are allowed to per- Comparing performances of two axes' movements, form the most of control. We found from experiments the x axis position error is less than 2 mm at steady that if the PD gains are set too large, performance is state, but that of the y axis is about 2 cm. The y axis even worse. As such, it is better to leave the keeps oscillating with a small bound.
98 International Journal of Control, Automation, and Systems Vol. 2, No. 1, March 2004 Position error of pendulum on x axis Even though balancing of the pendulum was quite 0.12 successful, controlling the cart position shows small 0.1 oscillatory errors in the y axis. One of the reasons 0.08 might be the location of optical encoders. Shown in 0.06 Fig. 7, encoders are mounted at the axis of rotating Position error of x axis (m) rod, not directly at the motor axis. 0.04 Another possibility is the velocity estimation. Sim- 0.02 ple velocity estimation using the finite difference 0 might not provide good estimation for slow move- -0.02 ment of the pendulum. Finally, the y axis is actuated by one side and not the other, which leads to the mis- -0.04 alignment that causes oscillation. This yields further -0.06 nonlinear characteristics such as backlash. These are -0.08 left as future works to be performed. 0 10 20 30 40 50 60 time (sec) REFERENCES Fig. 11. X axis position error. [1] S. Omatu, Y. Kishida, and M. Yoshioka, Position error of pendulum on y axis “Neuro-control for single-input multi-output 0.02 systems,” Proc. of IEEE Conf. on Knowledge 0.01 Based Intelligent Electronics Systems, pp. 202- 0 205, 1998. [2] I. Fantoni and R. Lozano, Non-linear Control -0.01 Position error of y axis (m) for Under-actuated Mechanical Systems, -0.02 Springer, 2002. -0.03 [3] M. W. Spong, “Swing up control of the acro- -0.04 bat,” Proc. of IEEE Conf. on Robotics and Automations, pp. 2356-2361, 1994. -0.05 [4] S. Jung and S. B. Yim, “Reference compensa- -0.06 tion technique using neural network for control- -0.07 ling a large x-y table robot,” International Sym- -0.08 posium on Robotics and Automations, pp. 461- 0 10 20 30 40 50 60 time (sec) 455, 2000. Fig. 12. Y axis position error. [5] S. Omatu, T. Fujinaka, and M. Yoshioka, “Neuro-pid control for inverted single and dou- The error eventually converges to zero, but it takes ble pendulums,” IEEE Conference, pp. 2685- time. One reason for this is that the y axis control is 2690, 2000. more difficult than the x axis control; because the y [6] S. Jung and S. B. Yim, “Experimental studies of axis continually carries the x axis and coupling ef- neural network control for nonlinear systems,” fects are well aware of this factor, requiring addi- ICASE journal of Control, Automation and Sys- tional torques to generate faster movements. tems Engineering, pp. 918-926, 2001. A further matter is one side actuation through tim- [7] T. Lahdhiri, C. Carnaland, and A. T. Alouani, ing belts. This causes minimal jerk by unbalanced “Cart-pendulum balancing problem using fuzzy movements to each axis even though an LM guide is logic control,” Proc. of IEEE Southeastcon’94, used. However, position tracking of the cart as well pp. 393-397, 1994. as upright position of the pendulum is satisfactory. [8] T. Hung, M. Yeh, and H. C. Lu, “A PI-like fuzzy controller implementation for the inverted pen- 8. CONCLUSIONS dulum system,” Proc. of IEEE Conference on Intelligent Processing Systems, pp. 218-222, The reference compensation technique of the neu- 1997. ral network for balancing a two degrees-of-freedom [9] M. E. Magana and F. Holzapfel, “Fuzzy logic PD controlled inverted pendulum has been presented. control of an inverted pendulum with vision The DRCT was very effective in balancing the feedback,” IEEE Trans. on Education, vol. 41, nonlinear inverted pendulum on the x-y plane by de- no. 2, pp. 165-170, 1998. coupling the x and y axis. The neural compensator [10] L. Wenzel, N. Vazquez, and ND. N. Jamal, helps conventional PD controllers to control the angle “Computer vision based inverted pendulum,” of the pendulum and the position of the cart simulta- Proc. of IEEE Conference on Instrumentation neously. and Measurement Technology, pp. 1319-1323,
International Journal of Control, Automation, and Systems Vol. 2, No. 1, March 2004 99 vol. 3, 2000. [11] S. Jung and T. C. Hsia, “On an effective design approach of cartesian space neural network con- trol of robot manipulators,” Robotica, vol. 15, Part 3, pp. 305-312, 1997. [12] S. Jung and T. C. Hsia, “Neural network inverse control techniques for PD controlled robot ma- nipulators,” Robotica, pp. 461-455, vol. 19, no. 3, 2000. Seul Jung received the B.S. degree in Electrical & Computer Engineering from Wayne State University in 1988, and the M.S and Ph.D. degrees from the University of California, Davis in 1991 and 1996, respectively. After a post doctoral position at the AHMCT (Advanced Highway Maintenance and Construction Technology) of Univer- sity of California, Davis, he joined the Chungnam National University in 1997. He is currently an Associate Professor at the Department of Mechatronic Engineering, Chungnam National University. His research interests include a hard- ware implementation of intelligent controllers, intelligent emotional engineering, and intelligent robotic systems. Hyun-Taek Cho received the B.S. and M.S. degrees in Mechatronics Engineering from Chungnam National University, in 2000 and 2003, respec- tively. He is now a Researcher in the Department of Intelligent Precision Machines at KIMM. His research interests involve neural network con- trollers and DSP systems.
You can also read