


File: TE-PR-SENDA-K-IROS04_3732.pdf
Size: 452.34 kB
Format: Adobe PDF
Title: Reinforcement learning accelerated by using state transition model with robotic applications
Authors: Senda, Kei
Fujii, Shinji
Mano, Syusuke
泉田, 啓
Issue date: September 2004
Publisher: IEEE
Citation: 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 4, pp. 3732-3737
Journal: 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Volume: 4
Start page: 3732
End page: 3737
Keywords: State transition model
Abstract: This paper discusses a method for accelerating reinforcement learning. First, a concept is defined that reduces the state space while conserving the policy. An algorithm is then given that calculates the optimal cost-to-go and the optimal policy in the reduced space from those in the original space. Using the reduced state space accelerates learning convergence. Its usefulness for both DP (dynamic programming) iteration and Q-learning is compared through a maze example. Convergence of the optimal cost-to-go in the original state space takes approximately N or more times as long as in the reduced state space, where N is the ratio of the number of states in the original space to that in the reduced space. The acceleration effect is more pronounced for Q-learning than for DP iteration. The proposed technique is also applied to a robot manipulator performing a peg-in-hole task with geometric constraints. The state space reduction can be regarded as a model of a change of observation, i.e., one kind of cognitive action. The results indicate that the change of observation is reasonable in terms of learning efficiency.
URI: http://hdl.handle.net/2297/1847
Type: Journal Article
Rights: ©2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 4, 2004, pp. 3732-3737
Version: publisher

Please use this identifier to cite or link to this item: http://hdl.handle.net/2297/1847
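
The abstract's claim that DP iteration converges faster in a reduced state space can be illustrated with a toy sketch. This is not the paper's algorithm. It assumes a hypothetical 1-D corridor "maze" with the goal at the right end, and a hand-made reduction that merges every RATIO adjacent cells into one state whose step cost is RATIO. The names value_iteration, N_ORIG, and RATIO are invented for this illustration.

```python
def value_iteration(n_states, step_cost, tol=1e-9):
    """Synchronous DP iteration on a 1-D corridor whose goal is the last state.

    Returns (cost_to_go, sweeps), where sweeps counts the sweeps that
    actually changed the cost-to-go estimate.
    """
    J = [0.0] * n_states
    sweeps = 0
    while True:
        new_J = []
        for s in range(n_states):
            if s == n_states - 1:
                new_J.append(0.0)          # goal state: zero cost-to-go
                continue
            left = max(s - 1, 0)           # a wall keeps the agent in place
            right = s + 1
            new_J.append(step_cost + min(J[left], J[right]))
        if max(abs(a - b) for a, b in zip(new_J, J)) < tol:
            return new_J, sweeps           # no change: converged
        J = new_J
        sweeps += 1


N_ORIG = 24   # assumed number of cells in the original corridor
RATIO = 4     # assumed ratio N of original states to reduced states

# Original space: one state per cell, unit cost per move.
J_orig, sweeps_orig = value_iteration(N_ORIG, step_cost=1.0)

# Reduced space (illustrative aggregation): one state per block of RATIO
# cells, and one move between blocks costs RATIO.
J_red, sweeps_red = value_iteration(N_ORIG // RATIO, step_cost=float(RATIO))

print("sweeps in original space:", sweeps_orig)
print("sweeps in reduced space: ", sweeps_red)
```

With these toy numbers the original corridor needs 23 sweeps and the reduced one needs 5, a ratio slightly above RATIO. The reduced cost-to-go of each block equals the original cost-to-go at that block's rightmost cell, and the greedy policy (move toward the goal) is the same in both spaces. How the paper itself constructs the reduced space and treats Q-learning is not stated in this record.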


