Source

SISY

DATE OF PUBLICATION

02/13/2023

Authors

Artem Sayapin, Ilya Makarov

Multi-Modal Deep Reinforcement Learning in ViZDoom with Audio Component

Abstract

This work examines audiovisual reinforcement learning in the ViZDoom environment extended with an audio component. We compare two models that use audiovisual features: Asynchronous Proximal Policy Optimization (APPO) and Importance Weighted Actor-Learner Architecture (IMPALA). We trained the agents in two ViZDoom scenarios: Music Recognition and Duel. The agents learned to play in the Duel scenario and reached stable performance in the Music Recognition scenario. IMPALA outperformed APPO in the Duel scenario, while APPO achieved results twice as good as IMPALA's in the Music Recognition scenario. Neither agent, however, achieved decent results in the Music Recognition task, and we outline directions for improvement in future research.

Similar publications

EAI: Emotional Decision-Making of LLMs in Strategic Games and Ethical Dilemmas

Mikhail Mozikov, Nikita Severin, Valeria Bodishtianu, Maria Glushanina, Ivan Nasonov, Daniil Orekhov, Vladislav Pekhotin, Ivan Makovetskiy, Mikhail Baklashkin, Vasily Lavrentyev, Akim Tsvigun, Denis Turdakov, Tatyana Shavrina, Andrey Savchenko, Ilya Makarov

SOURCE

SODAOpt: Socio-Demographic and Textual Adaptive Fusion for Optimizing Developer Task Assignment

Karina Romanova, Sergey Senichev, Lina Veltman, Ivan Nasonov, Andrey Kuznetsov, Ilya Makarov

SOURCE

Poster Abstract: Minimizing Labeling Efforts for Fault Detection and Diagnosis

Maria Shtark, Alexander Kozhevnikov, Petr Ivanov, Ilya Makarov

SOURCE

Poster Abstract: Exploring the Autoencoder Sequence Pooling

Petr Ivanov, Maria Shtark, Alexander Kozhevnikov, Ilya Makarov

SOURCE

Poster Abstract: Autonomous AI-Driven Grid Protection: Sub-Cycle Fault Response via NPU-Optimized Neural Networks

Alexander Kovalenko, Aleksey Evdakov, Galina Filatova, Andrey Yablokov, Aleksandr Bulashov, Ilya Makarov

SOURCE

Enhancing Emotion Recognition in Speech based on Self-Supervised Learning: Cross-Attention Fusion of Acoustic and Semantic Features

Bashar M. Deeb, Andrey Savchenko, Ilya Makarov

SOURCE

Closing the Domain Gap in Manga Colorization via Aligned Paired Dataset

Maksim Golyadkin, I. Plevokas, Ilya Makarov

SOURCE

AIRI Institute

You can ask us a question or suggest a joint project in the field of AI

event@airi.net

For events invitations

partner@airi.net

For scientific cooperation and
partnership

pr@airi.net

For journalists and media

people@airi.net

For any questions connected with
employees and employment

© 2025, AIRI
