Latent State Space Quantization for Learning and Exploring Goals
Abstract
Mastering multiple goals across diverse domains is a challenging learning problem for artificial agents. A crucial aspect of this challenge lies in effectively exploring both the goal space and the state space. We introduce an agent, LAtent QUantized eXplorative Achiever (LAQUAXA), that enables structured exploration by constructing a world model with two improvements: a learnable initial-state distribution and quantization of the latent state space. The first improves world model learning; the second defines a finite set of goals that the agent pursues in order to explore the environment. During a training episode, LAQUAXA selects and achieves one of these goals, which is already familiar to the agent. Subsequently, the agent switches to exploratory behavior driven by an intrinsically motivated policy. This dual-phase approach allows the agent to efficiently cover the volume of previously seen states while simultaneously exploring to acquire new knowledge about the environment. Through experiments, we demonstrate the effectiveness of the proposed framework in multi-goal learning across diverse domains: continuous and discrete mazes. We find that our approach surpasses baseline methods in which goals are set solely from the replay buffer and the initial-state distribution does not account for the first observation.
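As a rough illustration of the dual-phase episode described above, the sketch below draws one goal from a finite codebook of quantized latent states, pursues it with a goal-conditioned policy, and then switches to an intrinsically motivated exploration policy. All names (`codebook`, `quantize`, `goal_policy`, `explore_policy`, `dual_phase_episode`) and the random codebook are illustrative assumptions, not the authors' implementation; in LAQUAXA the goal set would come from the learned latent quantization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical codebook of K quantized latent states. In the actual method the
# codebook is learned jointly with the world model; here it is random, purely
# for illustration of the finite goal set.
K, latent_dim = 16, 8
codebook = rng.normal(size=(K, latent_dim))

def quantize(latent):
    """Map a continuous latent state to its nearest codebook entry."""
    dists = np.linalg.norm(codebook - latent, axis=1)
    return int(np.argmin(dists))

def dual_phase_episode(env, goal_policy, explore_policy,
                       achieve_steps=50, explore_steps=50):
    """Sketch of one two-phase training episode: first reach a familiar
    quantized goal, then explore from it with an intrinsic-reward policy."""
    obs = env.reset()
    # Phase 1: select one of the finite quantized goals and try to reach it.
    goal = codebook[rng.integers(K)]
    for _ in range(achieve_steps):
        action = goal_policy(obs, goal)
        obs, _, done, _ = env.step(action)
        if done:
            return
    # Phase 2: continue with intrinsically motivated exploration to discover
    # new states beyond the already familiar goal region.
    for _ in range(explore_steps):
        action = explore_policy(obs)
        obs, _, done, _ = env.step(action)
        if done:
            return
```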