Tatyana Shavrina, Valentin Malykh, Dina Pisarevskaya

Building a Bilingual QA-system with ruGPT-3


In this work, we present an approach to cross-lingual transfer learning between English and Russian for the question answering (QA) task. The approach takes a generative transformer model that has seen Wikipedia texts in both languages during pretraining and fine-tunes it with a special language token, which forces the model to generate text in a particular language. For training and testing we use SQuAD (English) and an updated SberQuAD (Russian), plus their translations. The model is ruGPT-3 XL, which is made to answer questions in English based on Russian paragraphs and, in reverse, to answer in Russian when the information is provided in English. The monolingual QA abilities of the model are also preserved.
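To illustrate the language-token idea, here is a minimal sketch of how one QA example might be serialized for a generative model. The token strings `<lang_en>` and `<lang_ru>`, the field labels, and the function name `build_example` are hypothetical; the text does not specify the actual format used for fine-tuning.

```python
# Hypothetical language-control tokens; the real token strings are not given in the text.
LANG_TOKENS = {"en": "<lang_en>", "ru": "<lang_ru>"}


def build_example(paragraph: str, question: str, answer_lang: str, answer: str = "") -> str:
    """Serialize one QA example into a single string for a generative model.

    The language token is placed right before the answer slot, so generation
    is conditioned on the requested output language regardless of the
    language of the paragraph. Leave `answer` empty to build an inference prompt.
    """
    if answer_lang not in LANG_TOKENS:
        raise ValueError(f"unsupported answer language: {answer_lang!r}")
    text = (
        f"Paragraph: {paragraph}\n"
        f"Question: {question}\n"
        f"{LANG_TOKENS[answer_lang]} Answer: {answer}"
    )
    # Drop the trailing space left behind when no answer is supplied.
    return text.rstrip()
```

With a layout like this, a cross-lingual training pair would combine a Russian paragraph with the English token and an English answer, and the same call without `answer` would give the prompt used at inference time.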
Our results show that the fine-tuned model demonstrates bilingual ability and can generate answers that are close to correct under fuzzy metrics. When generating answers in Russian from English texts, it reaches a 75% named entities ratio, 28% Levenshtein distance string matching, and 28% ROUGE-L. When generating answers in English from Russian data: 51% named entities ratio, 27% Levenshtein distance string matching, and 27% ROUGE-L. Monolingual Russian quality: 83% named entities ratio, 59% Levenshtein distance string matching, and 57% ROUGE-L. Monolingual English quality: 52% named entities ratio, 24% Levenshtein distance string matching, and 25% ROUGE-L.
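Two of the fuzzy metrics named above can be approximated in a few lines. This is a sketch under stated assumptions, not the evaluation code behind the reported numbers: the Levenshtein score is taken to be the edit distance normalized by the longer string, ROUGE-L is taken to be the F-measure over whitespace tokens, and all function names are illustrative. The named entities ratio is left out because it depends on an NER model that the text does not name.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def levenshtein_similarity(pred: str, ref: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(pred), len(ref))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(pred, ref) / longest


def lcs_length(x: list, y: list) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, start=1):
            curr.append(prev[j - 1] + 1 if xi == yj else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(pred: str, ref: str) -> float:
    """ROUGE-L F-measure over whitespace tokens (tokenization is an assumption)."""
    p_tok, r_tok = pred.split(), ref.split()
    lcs = lcs_length(p_tok, r_tok)
    if lcs == 0:
        return 0.0
    precision = lcs / len(p_tok)
    recall = lcs / len(r_tok)
    return 2 * precision * recall / (precision + recall)
```

Both scores lie in [0, 1] and give partial credit to near-miss answers, which is why they are better suited to a generative model than exact match.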
