Source: ECAI
Date of publication: 10/19/2024
Authors: Ilya Makarov, Andrey Savchenko, Bashar M. Deeb

CA-SER: Cross-Attention Feature Fusion for Speech Emotion Recognition

Abstract

In this paper, we introduce CA-SER, a novel method for speech emotion recognition that leverages self-supervised learning: it extracts semantic speech representations from a pre-trained wav2vec 2.0 model and combines them with spectral audio features to improve recognition performance. Our approach applies a self-attention encoder to MFCC features to capture meaningful patterns in audio sequences. These MFCC features are then fused with the high-level wav2vec 2.0 representations via a multi-head cross-attention mechanism. Evaluation on the IEMOCAP dataset shows that our system achieves a weighted accuracy of 74.6%, outperforming most existing techniques.
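The fusion step described above can be sketched as multi-head cross-attention in which MFCC-encoder outputs serve as queries and wav2vec 2.0 representations serve as keys and values. This is a minimal NumPy illustration, not the authors' implementation: the feature dimensions, number of heads, and random projection weights are assumptions chosen for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fusion(mfcc_feats, w2v_feats, num_heads=4, d_model=64, seed=0):
    """Fuse MFCC-encoder outputs (queries) with wav2vec 2.0 features
    (keys/values) via multi-head cross-attention.
    Projection weights are random here; a real model would learn them."""
    rng = np.random.default_rng(seed)
    d_head = d_model // num_heads
    Wq = rng.standard_normal((mfcc_feats.shape[-1], d_model)) / np.sqrt(mfcc_feats.shape[-1])
    Wk = rng.standard_normal((w2v_feats.shape[-1], d_model)) / np.sqrt(w2v_feats.shape[-1])
    Wv = rng.standard_normal((w2v_feats.shape[-1], d_model)) / np.sqrt(w2v_feats.shape[-1])
    Q, K, V = mfcc_feats @ Wq, w2v_feats @ Wk, w2v_feats @ Wv
    Tq, Tk = Q.shape[0], K.shape[0]
    # split into heads: (num_heads, time, d_head)
    Qh = Q.reshape(Tq, num_heads, d_head).transpose(1, 0, 2)
    Kh = K.reshape(Tk, num_heads, d_head).transpose(1, 0, 2)
    Vh = V.reshape(Tk, num_heads, d_head).transpose(1, 0, 2)
    # scaled dot-product attention: each MFCC frame attends over all wav2vec frames
    attn = softmax(Qh @ Kh.transpose(0, 2, 1) / np.sqrt(d_head))
    out = (attn @ Vh).transpose(1, 0, 2).reshape(Tq, d_model)
    return out  # one fused vector per MFCC frame

# toy shapes: 50 MFCC frames x 40 coefficients; 100 wav2vec frames x 768 dims
mfcc = np.full((50, 40), 0.1)
w2v = np.full((100, 768), 0.1)
fused = cross_attention_fusion(mfcc, w2v)
print(fused.shape)  # (50, 64)
```

The fused sequence could then be pooled over time and passed to an emotion classifier; the two input streams may have different frame rates, which cross-attention handles naturally since queries and keys need not align in length.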
