Source
CINTI
Date of publication
01/12/2022
Authors
Ilya Makarov, Denis Zuenko

Style-transfer Autoencoder for Efficient Deep Voice Conversion

Abstract

We consider the problem of voice cloning, which is in demand in many film-related industries, and develop a modification of AutoVC, a state-of-the-art model for voice conversion. We study the replacement of recurrent modules with convolutional layers while maintaining the quality of the original model. Our results show a speed improvement on longer voice tracks and faster training, with only a slight deterioration in sound quality, as evidenced by the reconstruction loss and Mel-cepstral distortion.
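
For readers unfamiliar with the second metric, the following is a minimal sketch of how Mel-cepstral distortion (MCD) is commonly computed. It is not taken from the paper: the function name `mel_cepstral_distortion` is hypothetical, the two inputs are assumed to be already time-aligned sequences of mel-cepstral coefficient frames, and the 0th (energy) coefficient is assumed to be excluded, as is conventional.

```python
import math


def mel_cepstral_distortion(ref_frames, conv_frames):
    """Mean MCD in dB between two time-aligned sequences of frames.

    Each frame is a list of mel-cepstral coefficients. Assumption: the
    0th coefficient (energy) is skipped and frames are already aligned.
    """
    if not ref_frames or len(ref_frames) != len(conv_frames):
        raise ValueError("inputs must be non-empty and have equal length")

    scale = 10.0 / math.log(10.0)  # converts the distance to decibels
    total = 0.0
    for ref, conv in zip(ref_frames, conv_frames):
        if len(ref) != len(conv):
            raise ValueError("frames must have the same number of coefficients")
        # Squared Euclidean distance over coefficients 1..D (0th skipped).
        sq_dist = sum((r - c) ** 2 for r, c in zip(ref[1:], conv[1:]))
        total += scale * math.sqrt(2.0 * sq_dist)

    # Average the per-frame distortion over all frames.
    return total / len(ref_frames)
```

Lower values indicate that the converted speech is spectrally closer to the reference. In practice, the frames would usually be aligned first (for example with dynamic time warping) when the two utterances differ in length.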
