Source
ACL Workshop
DATE OF PUBLICATION
07/31/2025
Authors
Ekaterina Antropova, Egor Kratkov, Roman Derunets, Margarita Trofimova, Ivan Bondarenko, Alexander Panchenko, Vasily Konovalov, Maksim Savkin

TabaQA at SemEval-2025 Task 8: Column Augmented Generation for Question Answering over Tabular Data

Abstract

The DataBench shared task in the SemEval-2025 competition aims to tackle the problem of question answering (QA) from tabular data. Given the diversity of the structure of tables, there are different approaches to retrieving the answer. Although Retrieval-Augmented Generation is a viable solution, extracting relevant information from tables remains a significant challenge. In addition, the table can be prohibitively large for direct integration into the LLM context. In this paper, we address QA over tabular data in three steps: first, we identify the relevant columns that might contain the answer; then the LLM generates an answer given the context of those columns; finally, the LLM refines its answer. This approach secured us 7th place in the DataBench lite category.
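
The three steps described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the `llm` callable (prompt in, text out), the prompt wording, the dict-of-lists table representation, and the `max_rows` cap are all assumptions made for the example.

```python
from typing import Callable, Dict, List

Table = Dict[str, List[object]]  # column name -> column values (assumed layout)
LLM = Callable[[str], str]       # hypothetical interface: prompt in, text out


def select_columns(llm: LLM, table: Table, question: str) -> List[str]:
    """Step 1: ask the model which columns might contain the answer."""
    prompt = (
        f"Question: {question}\n"
        f"Columns: {', '.join(table)}\n"
        "List the relevant column names, comma-separated."
    )
    picked = [name.strip() for name in llm(prompt).split(",")]
    # Keep only names that exist in the table; fall back to all columns.
    valid = [name for name in picked if name in table]
    return valid or list(table)


def answer_question(llm: LLM, table: Table, question: str,
                    max_rows: int = 20) -> str:
    columns = select_columns(llm, table, question)
    # Step 2: only the selected columns go into the context.
    # The row cap is an assumption to keep the prompt small.
    context = "\n".join(f"{c}: {table[c][:max_rows]}" for c in columns)
    draft = llm(f"Context:\n{context}\nQuestion: {question}\nAnswer:")
    # Step 3: the model is asked to refine its own draft.
    refined = llm(
        f"Context:\n{context}\nQuestion: {question}\n"
        f"Draft answer: {draft}\nReturn a corrected final answer only."
    )
    return refined.strip()
```

Any function that maps a prompt string to a reply string can be passed as `llm`, so the outline can be tried with a stub before a real model client is plugged in.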
