Source
ACL
DATE OF PUBLICATION
07/27/2025
Authors
Akim Tsvigun, Daniil Vasilev, Ivan Tsvigun, Ivan Lysenko, Talgat Bektleuov, Aleksandr Medvedev, Uliana Vinogradova, Nikita Severin, Mikhail Mozikov, Andrey Savchenko, Rostislav Grigorev, Ramil Kuleev, Fedor Zhdanov, Artem Shelmanov, Ilya Makarov

ATGen: A Framework for Active Text Generation

Abstract

Active learning (AL) has demonstrated remarkable potential in reducing the annotation effort required for training machine learning models. However, despite the surging popularity of natural language generation (NLG) tasks in recent years, the application of AL to NLG has been limited. In this paper, we introduce Active Text Generation (ATGen), a comprehensive framework that bridges AL with text generation tasks, enabling the application of state-of-the-art AL strategies to NLG. Our framework simplifies AL-empowered annotation in NLG tasks using both human annotators and automatic annotation agents based on large language models (LLMs). The framework supports LLMs deployed as services, such as ChatGPT and Claude, or operated on-premises. Furthermore, ATGen provides a unified platform for smooth implementation and benchmarking of novel AL strategies tailored to NLG tasks. Finally, we present evaluation results for state-of-the-art AL strategies across diverse settings and multiple text generation tasks. We show that ATGen reduces both the effort of human annotators and costs associated with API calls to LLM-based annotation agents. The code of the framework is available on GitHub under the MIT license. The video presentation is available at http://atgen-video.nlpresearch.group
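
To make the idea of AL-driven annotation concrete, the following is a minimal sketch of a generic pool-based active learning loop. It is not ATGen's API: the function names `select_batch` and `active_learning_loop`, the `uncertainty` scorer, and the `annotate` callback (standing in for a human annotator or an LLM-based annotation agent) are all hypothetical and chosen only for illustration. A real system would also retrain the generation model after each round, which is indicated only by a comment here.

```python
from typing import Callable, List, Tuple


def select_batch(
    pool: List[str],
    uncertainty: Callable[[str], float],
    batch_size: int,
) -> List[str]:
    """Return the `batch_size` pool items with the highest uncertainty score."""
    return sorted(pool, key=uncertainty, reverse=True)[:batch_size]


def active_learning_loop(
    pool: List[str],
    annotate: Callable[[str], str],
    uncertainty: Callable[[str], float],
    batch_size: int,
    rounds: int,
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Run a toy AL loop; return (labeled pairs, remaining unlabeled pool)."""
    remaining = list(pool)  # copy so the caller's list is not modified
    labeled: List[Tuple[str, str]] = []
    for _ in range(rounds):
        if not remaining:
            break
        batch = select_batch(remaining, uncertainty, batch_size)
        for text in batch:
            # Hypothetical annotator: a human or an LLM-based agent.
            labeled.append((text, annotate(text)))
            remaining.remove(text)
        # In a real setup the model would be retrained on `labeled` here,
        # and the uncertainty scores would change for the next round.
    return labeled, remaining


if __name__ == "__main__":
    # Toy stand-ins (assumptions): text length as "uncertainty",
    # upper-casing as the "annotation".
    texts = ["a", "bbb", "cc", "dddd"]
    labeled, remaining = active_learning_loop(
        texts, annotate=str.upper, uncertainty=len, batch_size=2, rounds=1
    )
    print(labeled)
    print(remaining)
```

Only the selected instances are sent to the annotator, which is where the savings in human effort or API calls would come from under this kind of scheme.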
