Source: ACL
Date of publication: 07/14/2023
Authors: Ivan Oseledets, Olga Tsymboi, Danil Malaev, Andrei Petrovskii

Layerwise universal adversarial attack on NLP models

Abstract

In this work, we examine the vulnerability of language models to universal adversarial triggers (UATs). We propose a new white-box approach to the construction of layerwise UATs (LUATs), which searches for triggers by perturbing the hidden layers of a network. On three transformer models and three datasets from the GLUE benchmark, we demonstrate that our method provides better transferability in the model-to-model setting, with an average gain of 9.3% in the fooling rate over the baseline. Moreover, we investigate trigger transferability in the task-to-task setting. Using small subsets of datasets similar to the target tasks to choose the perturbed layer, we show that LUATs are more efficient than vanilla UATs by 7.1% in the fooling rate.
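To make the layerwise idea concrete, below is a minimal sketch of what such a search could look like: a HotFlip-style token swap driven by how strongly the trigger perturbs a chosen hidden layer. Everything here is an illustrative assumption rather than the paper's exact formulation: the toy `TinyEncoder`, the mean-pooled perturbation loss, and the `luat_step` update are stand-ins chosen only to show the mechanics.

```python
# Hypothetical sketch of a layerwise universal-trigger search.
# The model, the loss, and the update rule are illustrative, not the
# paper's exact method.
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Stand-in for a transformer encoder: embeddings plus a stack of
    'hidden layers' whose intermediate states we can perturb."""
    def __init__(self, vocab_size=1000, dim=64, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_layers))

    def hidden_states(self, emb):
        """Run from embeddings; return each layer's mean-pooled state."""
        h, states = emb, []
        for layer in self.layers:
            h = torch.tanh(layer(h))
            states.append(h.mean(dim=1))  # (batch, dim) per layer
        return states

def luat_step(model, trigger_ids, batch_ids, layer_idx):
    """One HotFlip-style swap: for every trigger position, pick the vocab
    token whose embedding most increases the perturbation of the chosen
    hidden layer (first-order approximation via the trigger gradient)."""
    batch = batch_ids.size(0)
    # Hidden state of the clean batch at the chosen layer (no trigger).
    clean = model.hidden_states(model.embed(batch_ids))[layer_idx].detach()

    # Prepend the trigger and track gradients w.r.t. its embeddings only.
    trig_emb = model.embed(trigger_ids).detach().requires_grad_(True)
    full_emb = torch.cat(
        [trig_emb.expand(batch, -1, -1), model.embed(batch_ids)], dim=1)
    perturbed = model.hidden_states(full_emb)[layer_idx]

    # Illustrative layerwise objective: how far the trigger pushes the
    # chosen layer's representation away from its clean value.
    loss = (perturbed - clean).norm(dim=-1).mean()
    loss.backward()

    # Score every vocabulary token at every trigger position by the
    # first-order loss increase e_w . grad, and take the best swap.
    scores = model.embed.weight @ trig_emb.grad.t()  # (vocab, trig_len)
    return scores.argmax(dim=0)

# Toy usage: a 3-token universal trigger refined over a small batch.
model = TinyEncoder()
trigger = torch.randint(0, 1000, (3,))
batch = torch.randint(0, 1000, (8, 12))
for _ in range(10):
    trigger = luat_step(model, trigger, batch, layer_idx=1)
```

The `layer_idx` argument corresponds to the choice of perturbed layer discussed in the abstract, which the authors select using small subsets of datasets similar to the target task.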
