TY - JOUR
T1 - Learning unsupervised disentangled skill latents to adapt unseen task and morphological modifications
AU - Kim, Taewoo
AU - Yadav, Pamul
AU - Suk, Ho
AU - Kim, Shiho
N1 - Publisher Copyright: © 2022 The Author(s)
PY - 2022/11
Y1 - 2022/11
N2 - Learning adaptable policies in the absence of explicit reward signals is a challenging problem in reinforcement learning. We propose an algorithm that disentangles the morphology-aware variables from skill latent space for adapting quickly to unseen morphological changes. Through several task learning experiments using MuJoCo Ant environments, we demonstrate that the agent can perform zero-shot inference and adapt to mild modification in morphology within an expected performance range. Furthermore, in the case of severe unseen morphological damage to the agent's body, with the help of add-on training steps, we can subjoin additional value incorporated into the disentangled latent space without catastrophically destroying the pre-trained network. We observe that the proposed separable-skill based method outperforms prior evolutionary meta-learning-based approaches, and the presented approach opens up a research direction toward reinforcement learning for open-world novelty. Our source code is available at: https://github.com/boratw/sd4m.
AB - Learning adaptable policies in the absence of explicit reward signals is a challenging problem in reinforcement learning. We propose an algorithm that disentangles the morphology-aware variables from skill latent space for adapting quickly to unseen morphological changes. Through several task learning experiments using MuJoCo Ant environments, we demonstrate that the agent can perform zero-shot inference and adapt to mild modification in morphology within an expected performance range. Furthermore, in the case of severe unseen morphological damage to the agent's body, with the help of add-on training steps, we can subjoin additional value incorporated into the disentangled latent space without catastrophically destroying the pre-trained network. We observe that the proposed separable-skill based method outperforms prior evolutionary meta-learning-based approaches, and the presented approach opens up a research direction toward reinforcement learning for open-world novelty. Our source code is available at: https://github.com/boratw/sd4m.
KW - Disentangled latent space
KW - Reinforcement learning
KW - Skill-based policy
KW - Unsupervised skill discovery
UR - http://www.scopus.com/inward/record.url?scp=85137169406&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85137169406&partnerID=8YFLogxK
U2 - 10.1016/j.engappai.2022.105367
DO - 10.1016/j.engappai.2022.105367
M3 - Article
AN - SCOPUS:85137169406
SN - 0952-1976
VL - 116
JO - Engineering Applications of Artificial Intelligence
JF - Engineering Applications of Artificial Intelligence
M1 - 105367
ER -