
Hybrid Graphs for Table-and-Text based Question Answering using LLMs - accepted at NAACL 2025 - fltech - Technology Blog of Fujitsu Research


Introduction

Hello! I am Ankush Agarwal, a Researcher at the Artificial Intelligence Laboratory of Fujitsu Research India. My research focuses on knowledge graphs and deep learning, particularly their applications in the enterprise domain.

Recently, our research paper, 'Hybrid Graphs for Table-and-Text based Question Answering using LLMs,' was accepted at NAACL 2025, one of the top conferences in AI-NLP.

Authors: Ankush Agarwal, Ganesh S, Chaitanya Devaguptapu

Read Our NAACL 2025 Paper: https://aclanthology.org/2025.naacl-long.39/

TL;DR: Our method, ODYSSEY, transforms tables and text into a unified, question-tailored Hybrid Graph for Table-Text Question Answering. It filters out irrelevant details from the table and text, enabling LLMs to reason effectively without fine-tuning, and achieves state-of-the-art zero-shot performance while minimizing token usage.

Overview

1. Background

Tables and text are crucial data sources in industries such as healthcare, finance, and e-commerce. Many real-world applications, such as chatbot systems, require reasoning and aggregation across both formats. As a result, the ability to effectively process and answer questions from diverse data sources has become increasingly important. Despite its importance, most existing methods focus on a single data source—either structured or unstructured. Approaches that handle both tables and text typically require training data with pre-established connections between the two. However, in real-world applications, such training data is often unavailable and expensive to collect.

2. Proposed Method

To address table-text challenges, we propose ODYSSEY, an approach that enhances LLMs' ability to answer hybrid questions by efficiently retrieving and leveraging relevant context. Our method operates in a zero-shot setting, filtering out noise and providing the LLM with concise, relevant information. At a high level, ODYSSEY consists of two main steps: (i) constructing a unified Hybrid Graph from both textual and tabular data, and (ii) pruning the graph based on the question.

Key Components

We will explore the components of the ODYSSEY framework (Figure 1).

a) Question Analysis: We analyze the question to identify key entities and entity types that will later aid in graph construction, traversal, and answering the question.
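As a rough illustration of this step, the sketch below hard-codes the analysis result for one hypothetical question to show the kind of structure it produces (the entities, entity types, and entity-header mapping names here are illustrative, not the paper's actual data structures; in ODYSSEY this step is LLM-driven):

```python
# Hypothetical sketch of the question-analysis output. All names and the
# example question are illustrative, not taken from the paper.

def analyze_question(question: str) -> dict:
    """Toy stand-in for the LLM-driven analysis: returns the question's key
    entities, their types, and a mapping from entities to candidate table
    headers, hard-coded here for a single example question."""
    return {
        "question": question,
        "entities": ["1998 tournament"],
        "entity_types": {"1998 tournament": "Event"},
        # Maps each question entity to table headers it may match, which
        # later guides graph construction and traversal.
        "entity_header_mapping": {"1998 tournament": ["Year", "Tournament"]},
    }

result = analyze_question("Which country hosted the 1998 tournament?")
print(result["entity_header_mapping"])  # {'1998 tournament': ['Year', 'Tournament']}
```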

b) Hybrid Graph Construction: Our Hybrid Graph consists of two connected components: the table and the documents. By selecting the table headers relevant to the question, we retrieve a sub-table. In parallel, we construct an Entity-Document graph from the documents. These two components are then integrated to form the Hybrid Graph.
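A minimal sketch of this construction, assuming the entities have already been extracted and using simple string matching in place of the paper's matching procedure (the table, documents, and entity set below are invented for illustration):

```python
from collections import defaultdict

# Minimal sketch of Hybrid Graph construction (illustrative, not the paper's
# implementation). Nodes are entities, documents, and table cells; edges link
# entities to documents that mention them and to table cells that match them.

def build_hybrid_graph(sub_table, docs, entities):
    """sub_table: list of {header: cell} rows; docs: {doc_id: text};
    entities: pre-extracted entity strings."""
    graph = defaultdict(set)
    # Entity-Document edges: connect an entity to every document mentioning it.
    for doc_id, text in docs.items():
        for ent in entities:
            if ent.lower() in text.lower():
                graph[ent].add(doc_id)
                graph[doc_id].add(ent)
    # Entity-Table edges: connect an entity to cells whose value matches it.
    for i, row in enumerate(sub_table):
        for header, cell in row.items():
            cell_node = f"cell:{i}:{header}"
            for ent in entities:
                if ent.lower() == str(cell).lower():
                    graph[ent].add(cell_node)
                    graph[cell_node].add(ent)
    return graph

table = [{"Year": "1998", "Winner": "France"}]
documents = {"doc1": "France won the 1998 World Cup on home soil."}
g = build_hybrid_graph(table, documents, {"France", "1998 World Cup"})
print(sorted(g["France"]))  # → ['cell:0:Winner', 'doc1']
```

The key design point is that the two components share entity nodes, so a question can hop from a table cell to a supporting passage (or back) through a common entity.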

c) Hybrid Graph Traversal: After constructing the Hybrid Graph, we prune it based on the question to remove noise. Using the entity-header mapping dictionary derived from the text and table, we perform a Breadth-First Search to semantically match question entities with table column cells and entities in the entity-document graph.
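The pruning step can be sketched as a plain BFS with a hop limit, starting from the nodes matched to question entities (a simplification: the paper matches nodes semantically, while this toy version assumes the seed nodes are given; the graph below is invented for illustration):

```python
from collections import deque

# Hedged sketch of question-based pruning via Breadth-First Search
# (illustrative, not the paper's code). Starting from nodes matched to the
# question entities, we keep everything reachable within a small hop limit
# and discard the rest of the graph as noise.

def prune_graph(graph, seed_nodes, max_hops=2):
    """graph: {node: set(neighbors)}; seed_nodes: nodes matching question entities."""
    kept = set(seed_nodes)
    frontier = deque((n, 0) for n in seed_nodes)
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for nbr in graph.get(node, ()):
            if nbr not in kept:
                kept.add(nbr)
                frontier.append((nbr, depth + 1))
    # The pruned graph keeps only edges among the retained nodes.
    return {n: graph.get(n, set()) & kept for n in kept}

g = {
    "France": {"doc1", "cell:0:Winner"},
    "doc1": {"France"},
    "cell:0:Winner": {"France"},
    "noise_doc": {"noise_entity"},
    "noise_entity": {"noise_doc"},
}
pruned = prune_graph(g, ["France"])
print(sorted(pruned))  # → ['France', 'cell:0:Winner', 'doc1']
```

Only the pruned subgraph is serialized into the LLM's context, which is what drives the token savings reported below.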

Figure 1: Overview of the ODYSSEY framework. Our method comprises three steps: i) Question Analysis, ii) Hybrid Graph Construction, and iii) Hybrid Graph Traversal. We begin with Question Analysis (1a in the figure), which yields the question entities, the retrieved sub-table, and the entity-header mapping. Next, we construct the Entity-Document Graph (1b in the figure). Using the entity-document graph and the retrieved sub-table, we construct the Hybrid Graph (2 in the figure). Finally, we perform Hybrid Graph Traversal (3 in the figure) to obtain the pruned graph, which serves as input for the LLM.

3. Results

Performance Comparison

We evaluated ODYSSEY on the Hybrid-QA and OTT-QA datasets, two widely used Table-Text QA benchmarks. Our results are analyzed under two scenarios: (i) zero-shot baselines using closed and open-source models (Table 1) and (ii) comparison with existing fine-tuned approaches (Table 2).

Table 1: Table-Text QA Evaluation: We analyze Exact Match (EM), F1-Score, Precision (P), Recall (R), and BERTScore-F1 (B) in (%) to compare our method against baselines in a zero-shot setting using Llama3-8B, GPT-3.5, and GPT-4. The results consistently demonstrate significant improvements across datasets, metrics, and various language models. Base (only reader LLM); w/ Table & Text (table and passages relevant to the question); w/ Table & Summarized Text (table with summarized supporting passages); w/o hopwise (pruned information without considering hop-wise extraction).

Table 2: Performance comparison of ODYSSEY with fine-tuning-based and fine-tuning-free approaches on Hybrid-QA and OTT-QA. We evaluate our method against state-of-the-art fine-tuning-based methods as well as fine-tuning-free approaches using GPT-4.

Key Findings from Results:

  • Our method achieves the best performance in the zero-shot, fine-tuning-free setting across various LLMs, outperforming existing approaches.
  • On Hybrid-QA, our method performs comparably to fine-tuning-based approaches; on OTT-QA, it outperforms them.

4. Analysis

Our method processes the query context efficiently by retaining the relevant information and filtering out noise (irrelevant data). The resulting percentage reduction in token usage is shown in Figure 2.

Figure 2: Token Efficiency and Accuracy Improvements. Left: Average input token size and cost across 1000 samples based on OpenAI GPT-4-turbo. Right: Spider plot showing EM, F1, and Query Info Efficiency for Hybrid-QA and OTT-QA.

Key Findings from Analysis:

  • Our method effectively reduces the input token size for LLMs by nearly 50%, enhancing efficiency.
  • The reduction in token size directly correlates with a decrease in computational cost.
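Since input-side cost scales linearly with input tokens, the cost saving tracks the token reduction directly. A back-of-envelope illustration (the per-token price and baseline size below are placeholders, not actual OpenAI rates or our measured numbers):

```python
# Back-of-envelope illustration of how token reduction maps to cost savings.
# The ~50% reduction is the figure reported above; the price and baseline
# token count are hypothetical placeholders.

PRICE_PER_1K_INPUT_TOKENS = 0.01  # hypothetical $/1K input tokens
baseline_tokens = 4000            # hypothetical average input size
reduction = 0.50                  # ~50% reduction reported above

pruned_tokens = baseline_tokens * (1 - reduction)
baseline_cost = baseline_tokens / 1000 * PRICE_PER_1K_INPUT_TOKENS
pruned_cost = pruned_tokens / 1000 * PRICE_PER_1K_INPUT_TOKENS
print(pruned_tokens, round(baseline_cost - pruned_cost, 4))  # 2000.0 0.02
```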

5. Conclusion

In this paper, we introduce ODYSSEY, a zero-shot, fine-tuning-free approach for Table-Text QA. By leveraging a novel Hybrid Graph-based approach, ODYSSEY effectively navigates the complexities of multi-hop reasoning across structured and unstructured data sources. Compared to the baseline, our method improves Exact Match (EM) scores with GPT-3.5 by 10% on Hybrid-QA and 4.5% on OTT-QA, while reducing the input token size for the LLM reader by 45.5% and 53%, respectively, demonstrating that it represents the relevant information concisely. We believe the insights gained from our method can pave the way for more advanced and efficient QA systems capable of navigating the ever-growing landscape of heterogeneous data sources.