Source
NLDB
Publication date
04.07.2025
Authors
Mikhail Salnikov, Andrey Sakhovskiy, Irina Nikishina, Aida Usmanova, Angelie Kraft, Cedric Möller, Debayan Banerjee, Junbo Huang, Longquan Jiang, Rana Abdullah, Xi Yan, Elena Tutubalina, Ricardo Usbeck, Alexander Panchenko

ShortPathQA: A Dataset for Controllable Fusion of Large Language Models with Knowledge Graphs

Abstract

In this work, we release the Shortest Path subgraph Question Answering (ShortPathQA) dataset, the first dataset that provides textual questions with pre-computed relevant subgraphs retrieved from the Wikidata knowledge graph, standardizing the evaluation framework for Knowledge Graph Question Answering (KGQA). For this purpose, we utilize the Mintaka dataset for both training and testing and additionally create a manual question-answering subset for testing. Our baseline experiments with both supervised approaches and unsupervised Large Language Model (LLM) inference indicate that even a simplified KGQA formulation with given Knowledge Graph (KG) subgraphs and candidate answers remains challenging. Our analysis has shown that LLMs are unable to correctly process and utilize graph data structures without detailed prompt engineering or model tuning. This limitation highlights the need for the creation of this dataset as a training ground for the development of methods that enable LLMs to work more effectively with graph data.

