ATGen: A Framework for Active Text Generation
Abstract
Active learning (AL) has demonstrated remarkable potential in reducing the annotation effort required for training machine learning models. However, despite the surging popularity of natural language generation (NLG) tasks in recent years, the application of AL to NLG has been limited. In this paper, we introduce Active Text Generation (ATGen), a comprehensive framework that bridges AL with text generation tasks, enabling the application of state-of-the-art AL strategies to NLG. Our framework simplifies AL-empowered annotation in NLG tasks using both human annotators and automatic annotation agents based on large language models (LLMs). The framework supports LLMs deployed as services, such as ChatGPT and Claude, or operated on-premises. Furthermore, ATGen provides a unified platform for smooth implementation and benchmarking of novel AL strategies tailored to NLG tasks. Finally, we present evaluation results for state-of-the-art AL strategies across diverse settings and multiple text generation tasks. We show that ATGen reduces both the effort of human annotators and costs associated with API calls to LLM-based annotation agents. The code of the framework is available on GitHub under the MIT license. The video presentation is available at http://atgen-video.nlpresearch.group