%0 Journal Article
%T Decision Making in Reinforcement Learning Using a Modified Learning Space Based on the Importance of Sensors
%A Yasutaka Kishima
%A Kentarou Kurashige
%A Toshihisa Kimura
%J Journal of Sensors
%D 2013
%I Hindawi Publishing Corporation
%R 10.1155/2013/141353
%X Many studies have been conducted on the application of reinforcement learning (RL) to robots. A general-purpose robot has redundant sensors and actuators because it is difficult to anticipate the environment the robot will face and the task it must execute. In this case, the state-action space in RL contains redundancy, so the robot takes a long time to learn a given task. In this study, we focus on the importance of sensors with regard to a robot's performance of a particular task. The sensors that are applicable to a task differ according to the task. By using the importance of the sensors, we try to adjust the number of states per sensor and to reduce the size of the state-action space. In this paper, we define the measure of a sensor's importance for a task as the correlation between the value of that sensor and the reward. A robot calculates the importance of its sensors and shrinks the state-action space accordingly. We propose a method that reduces the learning space and construct a learning system by incorporating it into RL. We confirm the effectiveness of the proposed system with an experimental robot.
1. Introduction
In recent years, reinforcement learning (RL) [1] has been actively studied, and many studies on its application to robots have been conducted [2–4]. A matter of concern in RL is the learning time. In RL, information from sensors is projected onto a state space. A robot learns the correspondence between each state and each action in the state space and determines the best correspondence. When the state space expands with the number of sensors, the number of correspondences the robot must learn also increases.
In addition, the robot needs considerable experience in each state to perform a task. Therefore, learning the best correspondence becomes time-consuming. To overcome this problem, many studies have investigated accelerating RL [5–15], following two approaches: multirobot systems and autonomous construction of the state space. In the former approach, multiple robots exchange experience information [5–9], so that each robot augments its own knowledge. Robots in such a system can therefore find the best correspondence between each state and action faster than an individual robot in a single-robot system. In addition, Nishi et al. [10] proposed a learning method in which a robot learns behavior by observing the behavior of other robots, constructing its own relationships between state and behavior. However, in this approach, a robot needs other robots with whom to exchange experience information, and hence,
%U http://www.hindawi.com/journals/js/2013/141353/
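The core idea of the abstract can be sketched in code: score each sensor by the correlation between its readings and the reward, then give low-importance sensors fewer discretization states so the product state-action space shrinks. This is only an illustrative sketch, not the authors' implementation; the function names (`sensor_importance`, `allocate_bins`), the use of absolute Pearson correlation, and the proportional bin allocation are assumptions made here for concreteness.

```python
# Sketch (not the paper's code): rank sensors by |correlation with reward|,
# then allocate discretization bins in proportion to that importance.
import math

def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 for a constant series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def sensor_importance(sensor_logs, rewards):
    """sensor_logs: dict of sensor name -> readings, aligned with rewards.
    Importance of a sensor = |correlation of its readings with reward|."""
    return {name: abs(pearson(vals, rewards))
            for name, vals in sensor_logs.items()}

def allocate_bins(importance, max_bins=8, min_bins=1):
    """Assign each sensor a state count proportional to its importance,
    so task-irrelevant sensors contribute fewer states to the space."""
    top = max(importance.values()) or 1.0
    return {name: max(min_bins, round(max_bins * w / top))
            for name, w in importance.items()}
```

For example, a range sensor whose readings track the reward would keep the full `max_bins` states, while a sensor uncorrelated with the reward would be collapsed to `min_bins` states, directly reducing the size of the learning space before RL begins.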