Source

AAMAS

DATE OF PUBLICATION

05/19/2025

Authors

Yuriy Dorn, Aleksandr Katrutsa, Ilgam Latypov, Andrey Pudovikov

Fast UCB-type algorithms for stochastic bandits with heavy and super heavy symmetric noise

Abstract

In this study, we propose a new method for constructing UCB-type algorithms for stochastic multi-armed bandits based on general convex optimization methods with an inexact oracle. We derive the regret bounds corresponding to the convergence rates of the optimization methods. We propose a new algorithm, Clipped-SGD-UCB, and show, both theoretically and empirically, that in the case of symmetric noise in the reward we can achieve an $O(\log T \sqrt{KT \log T})$ regret bound instead of $O\left(T^{\frac{1}{1+\alpha}} K^{\frac{\alpha}{1+\alpha}}\right)$ for the case when the reward distribution satisfies $\mathbb{E}_{X \in D}[|X|^{1+\alpha}] \leq \sigma^{1+\alpha}$ ($\alpha \in (0, 1]$), i.e., the algorithm performs better than the general lower bound for bandits with heavy tails would suggest. Moreover, the same bound holds even when the reward distribution does not have an expectation, that is, when $\alpha < 0$.
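The idea of plugging a clipped-SGD estimate into a UCB index can be illustrated with a small sketch. It is not the paper's Clipped-SGD-UCB: the abstract does not give the clipping level, step sizes, or confidence radius, so `clip_level`, `step_scale`, `bonus_scale`, and the class and function names below are all hypothetical choices made for illustration. Rewards are drawn with Cauchy noise, which is symmetric and has no expectation.

```python
import math
import random


def clip(value, level):
    """Project value onto the interval [-level, level]."""
    return max(-level, min(level, value))


class ClippedSGDEstimator:
    """Location estimate from SGD on 0.5 * (x - reward)^2 with clipped gradients."""

    def __init__(self, clip_level=1.0, step_scale=2.0):
        self.clip_level = clip_level  # assumed clipping threshold
        self.step_scale = step_scale  # assumed step-size constant
        self.value = 0.0
        self.count = 0

    def update(self, reward):
        self.count += 1
        gradient = self.value - reward        # stochastic gradient at the current estimate
        step = self.step_scale / self.count   # decaying step size
        self.value -= step * clip(gradient, self.clip_level)


def cauchy_reward(rng, location):
    """Reward = location + standard Cauchy noise (symmetric, no finite mean)."""
    return location + math.tan(math.pi * (rng.random() - 0.5))


def run_ucb(locations, horizon, seed=0, bonus_scale=2.0):
    """UCB loop whose per-arm estimate is the clipped-SGD value.

    The bonus sqrt(bonus_scale * log(t) / n) is the classical UCB1 form,
    used here as a stand-in for the paper's confidence radius.
    """
    rng = random.Random(seed)
    arms = [ClippedSGDEstimator() for _ in locations]
    for t in range(1, horizon + 1):
        if t <= len(arms):
            chosen = t - 1  # pull every arm once first
        else:
            chosen = max(
                range(len(arms)),
                key=lambda i: arms[i].value
                + math.sqrt(bonus_scale * math.log(t) / arms[i].count),
            )
        arms[chosen].update(cauchy_reward(rng, locations[chosen]))
    return [arm.count for arm in arms], [arm.value for arm in arms]


if __name__ == "__main__":
    pulls, values = run_ucb([0.0, 3.0], horizon=2000)
    print("pulls per arm:", pulls)
    print("estimates:", values)
```

Because the noise is symmetric, the expected clipped gradient vanishes at the center of the distribution, so the estimate settles there even though the sample mean of Cauchy rewards never does. Clipping also bounds each update, which is what keeps a single extreme reward from derailing the index.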

Similar publications

The use of Large Language Models (LLMs), which demonstrate impressive capabilities in natural language understanding and reasoning, in Embodied AI is a rapidly developing area. As a part of an embodied agent, LLMs are typically used for behavior planning

SOURCE

XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

Alexander Nikulin, Ilya Zisman, Alexey Zemtsov, Viacheslav Sinii, Vladislav Kurenkov, Sergey Kolesnikov

SOURCE

Communications in Nonlinear Science and Numerical Simulation

Data-driven optimal prediction with control

Aleksandr Katrutsa, Ivan Oseledets, Sergey Utyuzhnikov

SOURCE

Overview of PAN 2024: Multi-author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification Condensed Lab Overview

Abinew Ali Ayele, Nikolay Babakov, Janek Bevendorff, Xavier Bonet Casals, Berta Chulvi, Daryna Dementieva, Ashaf Elnagar, Dayne Freitag, Maik Fröbe, Damir Korenčić, Maximilian Mayerl, Daniil Moskovskiy, Animesh Mukherjee, Alexander Panchenko, Martin Potthast, Francisco Rangel, Naquee Rizwan, Paolo Rosso, Florian Schneider, Alisa Smirnova, Efstathios Stamatatos, Elisei Stakovskii, Benno Stein, Mariona Taulé, Dmitry Ustalov, Xintong Wang, Matti Wiegmann, Seid Muhie Yimam, Eva Zangerle

SOURCE

ACL / TextGraphs

Prompt Me One More Time: A Two-Step Knowledge Extraction Pipeline with Ontology-Based Verification

Alla Chepurova, Yuri Kuratov, Aydar Bulatov, Mikhail Burtsev

SOURCE

Biomedical Signal Processing and Control

Negligible effect of brain MRI data preprocessing for tumor segmentation

Ekaterina Kondratyeva, Polina Druzhinina, Alexandra Dalechina, Svetlana Zolotova, Andrey Golanov, Boris Shirokikh, Mikhail Belyaev, Anvar Kurmukov

SOURCE

User Modeling and User-Adapted Interaction

Federated privacy-preserving collaborative filtering for on-device next app prediction

Albert Sayapin, Gleb Balitskiy, Daniel Bershatsky, Aleksandr Katrutsa, Evgeny Frolov, Alexey Frolov, Ivan Oseledets, Vitaliy Kharin

SOURCE
