Deep Reinforcement Learning for Continuous Action Control

Access & Terms of Use
open access
Copyright: Yang, Zhaoyang
Abstract
Deep reinforcement learning has greatly improved the performance of learning agents by combining the strong generalization and feature-extraction abilities of deep learning models with the bootstrapping nature of reinforcement learning. Many works have achieved unprecedented results, especially on discrete action control tasks. However, much less work has been done on robotic control in a continuous action space. Existing single-thread algorithms in this domain can only control a robot to solve basic tasks, and only one task at a time. This thesis aims to address these limitations. We first proposed a novel deep reinforcement learning network architecture that reduces the number of parameters needed to learn a single basic skill in a continuous action space by more than 70%. We then proposed a novel multi-task deep reinforcement learning algorithm that learns multiple basic tasks simultaneously; it uses the proposed network architecture to reduce the number of parameters needed for learning multiple tasks by more than 80%. Finally, we proposed a novel hierarchical deep reinforcement learning algorithm with two levels of hierarchy: the first level adapts the proposed multi-task learning algorithm to learn multiple basic skills, and the second level learns to reuse these skills to solve compound tasks. We conducted several sets of experiments to test both the proposed network architecture and the algorithms with a simulated Pioneer 3AT robot in Gazebo 2 under ROS Indigo. Results show that agents built with the proposed network architecture learn skills that are as good as those learned by agents built with traditional convolutional neural networks. All basic skills learned by the proposed multi-task learning algorithm also achieve performance comparable to skills learned independently by a single-task learning algorithm.
Results also show that the proposed hierarchical learning algorithm can learn both high-performance basic skills and compound skills within the same learning process. On compound tasks, the proposed algorithm outperforms both a state-of-the-art single-thread continuous action control algorithm and a well-known discrete action control algorithm.
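The two-level control flow described in the abstract (a high-level policy that selects among previously learned basic skills, and low-level skill policies that emit continuous actions) can be sketched roughly as follows. All class names, the linear policies, and the random weights are illustrative placeholders standing in for trained networks; this is a minimal sketch of the hierarchy, not the thesis's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

class SkillPolicy:
    """Hypothetical low-level basic skill: maps a state to a bounded
    continuous action (e.g. wheel velocities for a Pioneer 3AT)."""
    def __init__(self, state_dim, action_dim):
        # Random weights stand in for a trained policy network.
        self.W = rng.normal(size=(action_dim, state_dim))

    def act(self, state):
        # Linear map squashed by tanh into the range [-1, 1].
        return np.tanh(self.W @ state)

class HighLevelPolicy:
    """Hypothetical second-level policy: picks which basic skill to
    run for the current state, so compound tasks are solved by
    sequencing the skills learned in the first level."""
    def __init__(self, state_dim, n_skills):
        self.W = rng.normal(size=(n_skills, state_dim))

    def select_skill(self, state):
        # Greedy choice over per-skill scores.
        return int(np.argmax(self.W @ state))

state_dim, action_dim, n_skills = 4, 2, 3
skills = [SkillPolicy(state_dim, action_dim) for _ in range(n_skills)]
manager = HighLevelPolicy(state_dim, n_skills)

state = rng.normal(size=state_dim)
skill_id = manager.select_skill(state)   # level one: choose a skill
action = skills[skill_id].act(state)     # level two: skill emits action
```

In an actual agent, the high-level choice would typically persist for several timesteps while the selected skill executes, rather than being re-evaluated every step as in this sketch.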
Author(s)
Yang, Zhaoyang
Supervisor(s)
Kasmarik, Kathryn
Abbass, Hussein
Publication Year
2017
Resource Type
Thesis
Degree Type
Masters Thesis
UNSW Faculty