Source

KDD

DATE OF PUBLICATION

02/15/2024

Authors

Valeriy Shevchenko, Nikita Belousov, Alexey Vasilev, Vladimir Zholobov, Artyom Sosedka, Natalia Semenova, Anna Volodkevich, Andrey Savchenko, Alexey Zaytsev

From Variability to Stability: Advancing RecSys Benchmarking Practices

Recommender Systems, Evaluation, Benchmarking, Datasets, Data Characteristics

Abstract

In the rapidly evolving domain of Recommender Systems (RecSys), new algorithms frequently claim state-of-the-art performance based on evaluations over a limited set of arbitrarily selected datasets. However, this approach may fail to holistically reflect their effectiveness due to the significant impact of dataset characteristics on algorithm performance. Addressing this deficiency, this paper introduces a novel benchmarking methodology to facilitate a fair and robust comparison of RecSys algorithms, thereby advancing evaluation practices. By utilizing a diverse set of 30 open datasets, including two introduced in this work, and evaluating 11 collaborative filtering algorithms across 9 metrics, we critically examine the influence of dataset characteristics on algorithm performance. We further investigate the feasibility of aggregating outcomes from multiple datasets into a unified ranking. Through rigorous experimental analysis, we validate the reliability of our methodology under the variability of datasets, offering a benchmarking strategy that balances quality and computational demands. This methodology enables a fair yet effective means of evaluating RecSys algorithms, providing valuable guidance for future research endeavors.
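
To make the idea of aggregating per-dataset outcomes into a unified ranking concrete, here is a minimal sketch using mean-rank aggregation. The abstract does not say which aggregation scheme the paper uses, so this is an illustrative assumption rather than the authors' method. The function name, dataset names, algorithm names, and scores are all hypothetical toy values.

```python
from statistics import mean


def mean_rank_aggregation(scores):
    """Aggregate per-dataset metric values into one ranking by mean rank.

    scores: {dataset: {algorithm: metric value}}, where higher is better.
    Returns a list of (algorithm, mean rank) pairs, best first.
    Assumption: ties within a dataset are not handled specially.
    """
    ranks = {}
    for per_algo in scores.values():
        # Rank algorithms on this dataset: position 1 is the best score.
        ordered = sorted(per_algo, key=per_algo.get, reverse=True)
        for position, algo in enumerate(ordered, start=1):
            ranks.setdefault(algo, []).append(position)
    # Average each algorithm's rank over all datasets.
    mean_ranks = {algo: mean(r) for algo, r in ranks.items()}
    return sorted(mean_ranks.items(), key=lambda kv: kv[1])


# Hypothetical toy scores, not results from the paper.
toy_scores = {
    "dataset_a": {"algo_x": 0.30, "algo_y": 0.25, "algo_z": 0.10},
    "dataset_b": {"algo_x": 0.12, "algo_y": 0.18, "algo_z": 0.05},
    "dataset_c": {"algo_x": 0.40, "algo_y": 0.35, "algo_z": 0.38},
}

ranking = mean_rank_aggregation(toy_scores)
for algo, avg_rank in ranking:
    print(f"{algo}: mean rank {avg_rank:.2f}")
```

In this toy example no algorithm wins on every dataset, yet the mean rank still yields a single ordering. That is the kind of aggregation whose reliability under dataset variability the abstract says the paper examines.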
