Interpreting Decision Process in Offline Reinforcement Learning for Interactive Recommendation Systems

Source

ICONIP

DATE OF PUBLICATION

11/26/2023

Authors

Alexander Panov

Peter Kuderov

Zoya Volovikova

Abstract

Recommendation systems, which predict relevant and appealing items for users on web platforms, often rely on static user interests, resulting in limited interactivity and adaptability. Reinforcement Learning (RL), while providing a dynamic and adaptive approach, brings its unique challenges in this context. Interpreting the behavior of an RL agent within recommendation systems is complex due to factors such as the vast and continuously evolving state and action spaces, non-stationary user preferences, and implicit, delayed rewards often associated with long-term user satisfaction.

Addressing the inherent complexities of applying RL in recommendation systems, we propose a framework that includes innovative metrics and a synthetic environment. The metrics aim to assess the real-time adaptability of an RL agent to dynamic user preferences. We apply this framework to LastFM datasets to interpret metric outcomes and test hypotheses regarding MDP setups and algorithm choices by adjusting dataset parameters within the synthetic environment. This approach illustrates potential applications of our framework, while highlighting the necessity for further research in this area.

Full text