Source
Proceedings of ISP RAS
Publication date
24.02.2025
Authors
Георгий Сазонов
Кирилл Лукьянов
Серафим Боярский
Илья Макаров
Is AI interpretability safe: the relationship between interpretability and security of machine learning models
Keywords
interpretability, robustness, attacks on AI models, black-box attacks, graph node classification, trusted AI
Abstract
With the growing application of interpretable artificial intelligence (AI) models, increasing attention is being paid to issues of trust and security across all types of data. In this work, we focus on graph node classification, highlighting it as one of the most challenging tasks. To the best of our knowledge, this is the first study to comprehensively explore the relationship between interpretability and robustness. Our experiments are conducted on citation and purchase graph datasets. We propose methodologies for constructing black-box attacks on graph models based on interpretation results and demonstrate how adding protection affects the interpretability of AI models.
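
As a rough illustration of the kind of pipeline the abstract describes, the sketch below shows how interpretation output could steer a query-only (black-box) attack against a node classifier. It assumes the attacker only receives predicted labels from the victim model plus per-feature importance scores for the target node from some external explainer; every name here (interpretation_guided_attack, predict_fn, importance, budget) and the toy victim model are hypothetical placeholders, not the paper's actual method or code.

# Hypothetical sketch of an interpretation-guided black-box evasion attack on a
# graph node classifier; all names and the toy victim model are illustrative.
import numpy as np


def interpretation_guided_attack(predict_fn, x, target_node, importance, budget=5):
    """Flip up to `budget` binary features of `target_node`, most important first,
    querying the black-box `predict_fn` after every flip.

    predict_fn : callable mapping a feature matrix to predicted node labels
    x          : (num_nodes, num_features) binary feature matrix
    importance : (num_features,) importance scores for target_node, e.g. taken
                 from an external explainer exposed alongside the model
    """
    x_adv = x.copy()
    original_label = predict_fn(x)[target_node]
    # Spend the budget on the features the explanation marks as most influential.
    for feat in np.argsort(-importance)[:budget]:
        x_adv[target_node, feat] = 1 - x_adv[target_node, feat]
        if predict_fn(x_adv)[target_node] != original_label:
            return x_adv, True   # prediction flipped within the budget
    return x_adv, False          # budget exhausted without a flip


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_nodes, num_features, num_classes = 20, 16, 3
    x = rng.integers(0, 2, size=(num_nodes, num_features)).astype(float)

    # Stand-in for the victim model: a fixed random linear classifier.
    w = rng.normal(size=(num_features, num_classes))
    predict = lambda feats: (feats @ w).argmax(axis=1)

    # Stand-in for interpretation results: per-feature importance scores.
    importance = np.abs(rng.normal(size=num_features))

    x_adv, success = interpretation_guided_attack(predict, x, 0, importance)
    print("attack succeeded:", success)

The point the sketch tries to capture is that the explanation ranks features by influence, so the attacker spends a small perturbation budget on the features most likely to flip the prediction instead of searching blindly.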