Masanao Obayashi, Kenichiro Narita, Takashi Kuremoto, Kunikazu Kobayashi. Institute of Control, Robotics and Systems (ICROS), 2008. Proceedings of the ICROS International Conference, Vol. 2008, No. 10.
Humans learn from the incidents caused by their own actions and reflect them, as experience, in subsequent actions. These experiences are memorized in the brain and recollected when necessary. This research incorporates such an intelligent information-processing mechanism and applies it to an autonomous agent with three main functions: learning, memorization, and associative recollection. In the proposed system, an actor-critic type reinforcement learning method is used for learning. An auto-associative chaotic neural network is also used as a mutual associative memory system. Moreover, the memory part has an adaptive hierarchical layered structure of memory modules composed of chaotic neural networks, to cope with non-MDP (Markov Decision Process) environments. Finally, the effectiveness of the proposed method is verified through simulation of a maze-searching problem.
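As a point of reference for the learning component, the following is a minimal sketch of a tabular actor-critic learner for a maze-searching task, assuming a discrete state space and softmax action selection; the memorization and chaotic-neural-network parts of the paper are not reproduced here.

```python
# Minimal tabular actor-critic sketch (illustrative only, not the paper's full system).
import numpy as np

class ActorCritic:
    def __init__(self, n_states, n_actions, alpha=0.1, beta=0.1, gamma=0.95):
        self.p = np.zeros((n_states, n_actions))  # actor: action preferences
        self.v = np.zeros(n_states)               # critic: state values
        self.alpha, self.beta, self.gamma = alpha, beta, gamma

    def act(self, s):
        prefs = self.p[s] - self.p[s].max()
        probs = np.exp(prefs) / np.exp(prefs).sum()  # softmax policy
        return np.random.choice(len(probs), p=probs)

    def update(self, s, a, r, s_next, done):
        target = r if done else r + self.gamma * self.v[s_next]
        delta = target - self.v[s]                 # TD error
        self.v[s] += self.beta * delta             # critic update
        self.p[s, a] += self.alpha * delta         # actor update
        return delta
```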
Liangbing Feng, Masanao Obayashi, Takashi Kuremoto, Kunikazu Kobayashi. ICROS, 2010. Proceedings of the ICROS International Conference, Vol. 2010, No. 10.
A hybrid intelligent control system model that combines a high-level time Petri net (HLTPN) and reinforcement learning (RL) is proposed. In this model, the control system is modeled by an HLTPN, and the time that a system state lasts is represented as the delay time of a transition. To optimize the transition delay times through learning, a value item is appended to the delay time of each transition to record the reward from the environment, and this value is learned using Q-learning, a kind of RL. Because the delay time of a transition is continuous, two continuous-space RL algorithms are used in the Petri net learning process. Finally, to verify the effectiveness of the proposed system, it is used to model a guide dog robot system whose environment is constructed using radio-frequency identification (RFID). The experimental results show that the proposed method is useful and effective.
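To illustrate the idea of learning transition delays, here is a sketch that applies Q-learning over a discretized set of candidate delay times per transition. The paper treats the delay as continuous and uses continuous-space RL; the quantization and the names `candidate_delays` and `choose_delay` are purely hypothetical simplifications.

```python
# Illustrative sketch: Q-learning over discretized candidate delay times for
# Petri-net transitions (not the paper's continuous-space algorithms).
import random

candidate_delays = [0.1, 0.2, 0.5, 1.0, 2.0]   # assumed discretization (seconds)
q = {}                                          # (transition, delay index) -> value
alpha, gamma, eps = 0.1, 0.9, 0.1

def choose_delay(transition):
    if random.random() < eps:                   # epsilon-greedy exploration
        return random.randrange(len(candidate_delays))
    return max(range(len(candidate_delays)),
               key=lambda i: q.get((transition, i), 0.0))

def update(transition, delay_idx, reward, next_transition):
    best_next = max(q.get((next_transition, i), 0.0)
                    for i in range(len(candidate_delays)))
    old = q.get((transition, delay_idx), 0.0)
    q[(transition, delay_idx)] = old + alpha * (reward + gamma * best_next - old)
```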
Robust Reinforcement Learning Control System with H∞ Tracking Performance Compensator
Shogo Uchiyama, Masanao Obayashi, Takashi Kuremoto, Kunikazu Kobayashi. ICROS, 2011. Proceedings of the ICROS International Conference, Vol. 2011, No. 10.
Robust control theory generally guarantees the robustness and stability of the closed-loop system; however, it requires a mathematical model of the system to design the controller, so it often cannot deal with nonlinear systems because such systems are difficult to model. Reinforcement learning, on the other hand, can deal with nonlinear systems without a mathematical model, but it usually does not guarantee the stability of control. In this paper, we propose a Robust Reinforcement Learning Control System (RRLCS) that combines reinforcement learning, to treat unknown nonlinear systems, with robust control theory, to guarantee the robustness and stability of the system. As the robust control method, we adopt H∞ control, which is robust to modeling error and disturbance. As the reinforcement learning method, we adopt an actor-critic method with a minimal amount of computation for continuous action and state spaces. Moreover, we analyze the stability of the proposed system using the H∞ tracking performance and a Lyapunov function. Finally, through computer simulation of controlling an inverted pendulum system, we show the effectiveness of the proposed method in comparison with an adaptive fuzzy control method with H∞ tracking performance compensator (AFC) and an auto-structuring fuzzy neural control system (ASFNCS).
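For readers unfamiliar with the notion of H∞ tracking performance, a commonly used form of the criterion is sketched below; the exact criterion and weighting matrices used in the paper are not given in this abstract, so Q, P, and the attenuation level rho here are generic placeholders.

```latex
% H-infinity tracking performance (generic form): the tracking error e(t) is
% attenuated to a prescribed level \rho for any bounded disturbance w(t),
% with positive definite weights Q and P.
\int_0^{T} e(t)^{\top} Q\, e(t)\, dt
  \;\le\; e(0)^{\top} P\, e(0)
  \;+\; \rho^{2} \int_0^{T} w(t)^{\top} w(t)\, dt
```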
Masanao Obayashi, Tomohiro Nishida, Takashi Kuremoto, Kunikazu Kobayashi, Liang-Bing Feng. ICROS, 2010. Proceedings of the ICROS International Conference, Vol. 2010, No. 10.
This paper concerns a way of making robots (called "agents" here) intelligent. Humans learn from the incidents caused by their own actions and reflect them, as experience, in subsequent actions. These experiences are memorized in the brain and recollected and reused when necessary. This research incorporates such an intelligent information-processing mechanism and applies it to an autonomous agent with three main functions: learning, memorization, and associative recollection. In the proposed system, an actor-critic type reinforcement learning method is used for learning. For memorization, we introduce the chaotic auto-associative model proposed by Chartier, which is also used as a mutual associative memory system. Moreover, to deal with the increase of information, the memory part has an adaptive hierarchical layered structure of memory modules composed of chaotic neural networks, especially for multi-valued patterns. Finally, the effectiveness of the proposed method is verified through simulation of a maze-searching problem.
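To illustrate the "store a pattern, recollect it from a noisy cue" idea behind the memorization function, the following is a minimal auto-associative recall sketch using a Hopfield-style network with bipolar patterns. It is not Chartier's chaotic multi-valued model used in the paper; it only shows the basic associative-recollection mechanism.

```python
# Minimal auto-associative memory sketch (Hopfield-style, illustrative only).
import numpy as np

def train(patterns):
    """Hebbian weight matrix from a list of +/-1 pattern vectors."""
    n = patterns[0].size
    w = np.zeros((n, n))
    for p in patterns:
        w += np.outer(p, p)
    np.fill_diagonal(w, 0.0)
    return w / len(patterns)

def recall(w, cue, steps=20):
    """Iterated synchronous update from a (possibly noisy) cue pattern."""
    x = cue.astype(float).copy()
    for _ in range(steps):
        x = np.sign(w @ x)
        x[x == 0] = 1
    return x
```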
Intelligent Tracking Control Method of A Target by Group of Agents with Nonlinear Dynamics
Masanao Obayashi, Yuuki Yokoji, Shogo Uchiyama, Liangbing Feng, Takashi Kuremoto, Kunikazu Kobayashi. ICROS, 2011. Proceedings of the ICROS International Conference, Vol. 2011, No. 10.
This paper proposes an intelligent tracking control method in which a group of agents with nonlinear dynamics tracks a target. In the proposed method, agents exchange only position information, and the group tracks the target based only on its position while maintaining a predefined formation. In the real world, a method in which agents do not require much information is useful when communication is weak or delayed. In addition, each agent is assumed to have nonlinear dynamics and to be subject to external disturbance. Therefore, an auto-structuring fuzzy neural control system (ASFNCS) is introduced to provide appropriate control inputs while coping well with the disturbance and nonlinear dynamics. Thus, while maintaining adequate control performance, the method can reduce the computational load. In the simulation, it is verified that the proposed method achieves good target tracking performance through distributed, cooperative agent behaviors.
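As a simple baseline for position-only formation tracking, the sketch below drives each agent toward its own offset from the target plus a consensus term built from neighbours' positions. This is only a kinematic illustration under assumed gains `k_t` and `k_c`; the paper's ASFNCS controller additionally handles the agents' nonlinear dynamics and disturbances.

```python
# Illustrative position-only formation tracking law (not the paper's ASFNCS).
import numpy as np

def tracking_input(p_i, offset_i, p_target, neighbour_ps, neighbour_offsets,
                   k_t=1.0, k_c=0.5):
    """Velocity command for agent i, computed from positions only."""
    # Track the target while keeping the agent's formation offset.
    u = k_t * (p_target + offset_i - p_i)
    # Consensus term: agree with neighbours on the (offset-corrected) target position.
    for p_j, offset_j in zip(neighbour_ps, neighbour_offsets):
        u += k_c * ((p_j - offset_j) - (p_i - offset_i))
    return u
```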