Deep Reinforcement Learning for Continuous Action Control

Access & Terms of Use
open access
Copyright: Yang, Zhaoyang
Abstract
Deep reinforcement learning has greatly improved the performance of learning agents by combining the strong generalization and feature-extraction abilities of deep learning models with the bootstrapping nature of reinforcement learning. Many works have achieved unprecedented results, especially on discrete action control tasks. However, much less work has been done on robotic control in a continuous action space. Existing single-thread algorithms in this domain can only control a robot to solve basic tasks, and only one task at a time. This thesis aims to address these limitations. We first proposed a novel deep reinforcement learning network architecture that reduces the number of parameters needed to learn a single basic skill in a continuous action space by more than 70%. We then proposed a novel multi-task deep reinforcement learning algorithm that learns multiple basic tasks simultaneously; it uses the proposed network architecture to reduce the number of parameters needed for learning multiple tasks by more than 80%. Finally, we proposed a novel hierarchical deep reinforcement learning algorithm with two levels of hierarchy: the first level adapts the proposed multi-task learning algorithm to learn multiple basic skills, and the second level learns to reuse these skills to solve compound tasks. We conducted several sets of experiments to test both the proposed network architecture and the algorithms with a simulated Pioneer 3AT robot in Gazebo 2 under ROS Indigo. Results show that agents built with the proposed network architecture learn skills that are as good as those learned by agents built with traditional convolutional neural networks. All basic skills learned by the proposed multi-task learning algorithm also achieve performance comparable to skills learned independently by a single-task learning algorithm.
Results also show that the proposed hierarchical learning algorithm can learn both high-performance basic skills and compound skills within the same learning process. On compound tasks, the proposed algorithm outperforms both a state-of-the-art single-thread continuous action control algorithm and a well-known discrete action control algorithm.
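The two-level control flow described in the abstract (a high-level policy that selects among previously learned basic skills, and low-level skill policies that emit continuous actions) can be sketched roughly as follows. All class names, the linear policies, and the random weights are illustrative placeholders standing in for trained networks; this is a minimal sketch of the hierarchy, not the thesis's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

class SkillPolicy:
    """Hypothetical low-level basic skill: maps a state to a bounded
    continuous action (e.g. wheel velocities for a Pioneer 3AT)."""
    def __init__(self, state_dim, action_dim):
        # Random weights stand in for a trained policy network.
        self.W = rng.normal(size=(action_dim, state_dim))

    def act(self, state):
        # Linear map squashed by tanh into the range [-1, 1].
        return np.tanh(self.W @ state)

class HighLevelPolicy:
    """Hypothetical second-level policy: picks which basic skill to
    run for the current state, so compound tasks are solved by
    sequencing the skills learned in the first level."""
    def __init__(self, state_dim, n_skills):
        self.W = rng.normal(size=(n_skills, state_dim))

    def select_skill(self, state):
        # Greedy choice over per-skill scores.
        return int(np.argmax(self.W @ state))

state_dim, action_dim, n_skills = 4, 2, 3
skills = [SkillPolicy(state_dim, action_dim) for _ in range(n_skills)]
manager = HighLevelPolicy(state_dim, n_skills)

state = rng.normal(size=state_dim)
skill_id = manager.select_skill(state)   # level one: choose a skill
action = skills[skill_id].act(state)     # level two: skill emits action
```

In an actual agent, the high-level choice would typically persist for several timesteps while the selected skill executes, rather than being re-evaluated every step as in this sketch.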
Author(s)
Yang, Zhaoyang
Supervisor(s)
Kasmarik, Kathryn
Abbass, Hussein
Publication Year
2017
Resource Type
Thesis
Degree Type
Masters Thesis
UNSW Faculty