
Hello, we are Vempalli Saketh, Siddartha Reddy Thummaluru, Harsh Pandey and Mahesh Chandran from the Artificial Intelligence Research Laboratory at Fujitsu Research of India (FRIPL). We are excited to share our latest research focused on making Graph AI more transparent, trustworthy, and actionable.
As explored in our previous blog, Scaling Graph AI to Billion-sized Graphs, graph structures have firmly established themselves as the backbone of modern digital infrastructure. Whether it is social networks analyzing billions of connections or financial systems monitoring vast transaction flows, the industry has made tremendous strides in processing massive interconnected datasets. However, as we build upon these scalable architectures, a critical trade-off emerges: the sheer complexity required to handle this scale often deepens the opacity of the models.
Raw processing power, in other words, is not enough. While we can now compute over billions of edges, the sophisticated architectures required to do so, such as deep Graph Neural Networks (GNNs), often sacrifice transparency for performance. In our experience, this trade-off creates a formidable barrier to adoption. We call this the "black box" dilemma: having a model that works perfectly, but lacking the ability to verify the validity of its logic in critical scenarios.
When a model flags a transaction as fraudulent or predicts a new drug interaction, stakeholders invariably ask: Why? As we tackled this challenge, we realized that trust in Graph AI requires a two-pronged approach.
Explainability is the first key requirement. When you already have a trained black box model, explainability aims to make the decisions understandable. It focuses on uncovering the patterns these models have learned, helping to build trust and clarity, and even enabling us to identify and correct undesirable or biased patterns.
Interpretability is the second requirement. Rather than explaining the decisions after they are made, interpretability asks whether models can be transparent by design. Can we construct architectures that "show their work" as they compute, making the reasoning process intrinsic to the model itself?
This blog presents our path-breaking technologies around these two concepts, not as competing solutions but as complementary pillars of Trustworthy Graph AI. They were presented by FRIPL researchers at the recently concluded NeurIPS 2025 conference in San Diego, US, an annual gathering of AI researchers from academia and industry.
1. Graph Explainability: Natural language explanations of GNNs
Graph Neural Networks (GNNs) remain largely black-box models. Prior work explains GNN inference either locally (instance-level explanations), e.g., RCExplainer [1] and MEG [2], or globally through explanations that depend heavily on motif discovery, such as GNNInterpreter [3] and GLGExplainer [4]. A local explanation attempts to justify why an individual node belongs to a specific class, but it falls short of answering a more fundamental question:
Q: What global logic does the model follow for an entire class?
We address this gap for node classification in our recent collaborative work with Prof. Sayan Ranu and his research group from IIT Delhi, leading to an oral presentation “GnnXemplar: Exemplars to Explanations — Natural Language Rules for Global GNN Interpretability” at NeurIPS 2025. The paper is co-authored by Burouj Armgaan, Eshan Jain, Harsh Pandey, Mahesh Chandran and Sayan Ranu. In this work instead of relying on complex subgraph patterns, we propose a human-centric approach that aligns model explanations with how people naturally reason by leveraging the expressive power of large language models (LLMs).
In our work, we suggest that effective explanations of GNN predictions must satisfy two key criteria:
- they should be faithful, meaning they accurately reflect the model's decision-making process, and
- they should be interpretable, allowing humans to understand the reasoning behind the predictions despite the underlying complexity of the graph modality.
The Key Idea: Exemplar Theory and Natural Language Explanations
GnnXemplar draws inspiration from Exemplar Theory in cognitive science. According to this theory, humans categorize new instances by comparing them to concrete, representative examples stored in memory, rather than relying on abstract rules.
We bring this idea to GNN explainability by identifying exemplar nodes—nodes in the embedding space that best capture how the model internally represents a given class. These exemplars serve as anchors for understanding the model’s global behavior.
Furthermore, to enhance interpretability, we move beyond subgraph visualizations, which are inherently ineffective in the large, dense graphs typical of node classification datasets due to their complexity and scale. Instead, we distill the common characteristics (which we refer to as the signature) of each exemplar node and its associated population into textual explanations. This makes the explanations more direct and easier to understand.
How GnnXemplar Works
Below, Figure 1 shows the complete pipeline of GnnXemplar. The pipeline is divided into two stages: the first stage discovers exemplar nodes in the graph for each class, and the second stage converts the signatures of these exemplar nodes into natural language. Together, the two stages turn the high-dimensional representations (also called embeddings) that these black-box models compute and learn into human-readable explanations.

Let's drill down deeper into the two stages.
1. Discovering Representative Exemplars:
Identifying representative nodes is non-trivial. We frame this as a coverage maximization problem in the embedding space.
- Reverse k-Nearest Neighbors (Rev-k-NN): Rather than asking which neighbors a node is close to, we ask how many nodes consider the given node their nearest representative, i.e., which nodes are most popular. Nodes that attract many such references are strong exemplar candidates.
- Greedy Approximation: While selecting a subset from these popular nodes, we need to optimize coverage of all the training nodes. Since optimal coverage selection is NP-hard, we use an efficient greedy strategy to select a diverse set of exemplars that collectively represent the class well. We iteratively select nodes that maximize the marginal gain in coverage. This avoids redundancy among nodes that share most of their population, since selecting one reduces the marginal utility of the others, thus promoting both coverage and diversity within the budget.
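The two steps above can be sketched as follows. This is a simplified illustration with brute-force pairwise distances and hypothetical function names; the paper's actual implementation uses sampling to scale Rev-k-NN to massive graphs.

```python
import numpy as np

def reverse_knn_counts(emb, k):
    """For each node, count how many other nodes include it among their
    k nearest neighbors in embedding space (its Rev-k-NN popularity)."""
    d = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # a node is not its own neighbor
    knn = np.argsort(d, axis=1)[:, :k]     # k nearest neighbors of each node
    counts = np.zeros(len(emb), dtype=int)
    for row in knn:
        counts[row] += 1                   # each appearance is one "vote"
    return counts, knn

def greedy_exemplars(knn, n_nodes, budget):
    """Greedily pick exemplars maximizing marginal coverage gain:
    a candidate 'covers' every node that lists it among its k-NN."""
    covers = [set(np.where((knn == v).any(axis=1))[0]) for v in range(n_nodes)]
    covered, exemplars = set(), []
    for _ in range(budget):
        gains = [len(covers[v] - covered) for v in range(n_nodes)]
        best = int(np.argmax(gains))
        if gains[best] == 0:               # everything reachable is covered
            break
        exemplars.append(best)
        covered |= covers[best]
    return exemplars
```

On a toy embedding with two well-separated clusters, the greedy step picks one exemplar per cluster, since a second pick from the same cluster adds almost no marginal coverage.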
2. Exemplars to Natural Language Rules:
Once exemplars are identified, the next challenge is discovering their signatures in natural language. To this end, we leverage the reasoning capabilities of large language models (LLMs). Our goal is to derive a symbolic rule, expressed in natural language, that closely matches the GNN's predictions. The idea is to give the LLM some positive samples and some negative samples and extract a signature that is present in the positive samples but not in the negative ones. We divide this step into two parts:
- Initial Prompt Generation: For a given exemplar, we first build a positive set containing the exemplar node and a randomly picked subset of its population. We then build a negative set by randomly picking from the rest of the nodes in the training dataset. Once we have the positive and negative sets, we serialize each node and its neighborhood so that it can be passed to the LLM for signature extraction. If the black-box GNN is L-hop, we provide a summary of the L-hop neighborhood of each node in the positive and negative sets. The summary includes:
  - The attributes of the node.
  - The normalized frequency distribution of GNN-predicted class labels for all nodes in each hop.
  - The average L1 distance per attribute between the exemplar and all nodes at each hop level.
- Self-Refinement, LLM-Based Rule Generation: Rather than querying the LLM only once to produce the natural language signature, we adopt a more robust iterative approach. The LLM is repeatedly prompted, and its outputs are validated on a held-out validation set. In the initial prompt, the LLM is asked to generate a Python function that tests for the presence of the signature, and then it is asked to generate a natural language rule from this function. The Python function lets us evaluate the signature, identify failure nodes, and iteratively re-prompt the LLM with the problematic nodes, resulting in progressively refined natural language signatures.
An example of a signature is: Classify as Neural Networks Research Paper if the node mentions Neural Networks and connects to at least three Computer Science nodes.
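As a concrete illustration, a signature like the one above could be captured by an LLM-generated Python check of roughly the following form. The `node` dictionary layout and the function name are hypothetical, chosen only to make the sketch self-contained; the actual serialization used in the paper is the neighborhood summary described earlier.

```python
def neural_networks_rule(node):
    """Hypothetical LLM-generated check for the example signature:
    the node mentions Neural Networks AND connects to at least
    three nodes predicted as Computer Science."""
    mentions_nn = "neural network" in node["text"].lower()
    cs_neighbors = sum(1 for lbl in node["neighbor_labels"]
                       if lbl == "Computer Science")
    return mentions_nn and cs_neighbors >= 3
```

Nodes where this function disagrees with the GNN's prediction become the "failure nodes" that are fed back to the LLM in the next refinement round.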
Why GnnXemplar Matters
Traditional global explainers often depend on recurring subgraph motifs, an approach that breaks down on large, real-world graphs where such motifs are rare. GnnXemplar overcomes this key limitation by using signatures and, on top of that, offers:
- Scalability: Sampling-based Rev-k-NN enables explanations on massive graphs like OGB-ArXiv.
- Human Interpretability: Generates natural language rules, which users consistently prefer over dense subgraph visualizations. This is especially important for bringing explanations of AI model inference within the reach of non-experts.
- High Fidelity: The explanations accurately reflect the model’s internal reasoning without simplifying the model itself.
- Faithfulness to GNN: The high fidelity scores, together with the use of the GNN's own node embeddings, make the explanations faithful to the GNN.
Figure 2 compares different graph explainability methods on a variety of datasets and shows a clear advantage for GnnXemplar. Note that the fidelity metric measures the alignment of explanations with the model's predictions rather than the ground truth: if the GNN predicts a node as class A, the explainability method should also predict it as class A (even if A is not the ground-truth label for that node). This is because we are interested in how well the explainability method aligns with the black-box GNN model, not the ground truth. Most other methods fail to achieve even decent fidelity because they rely on motif-based explanations, and for node classification there is often no discrete motif to identify.
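The fidelity metric as described above reduces to a simple agreement rate. This sketch assumes the two predictions are given as parallel arrays of class labels:

```python
import numpy as np

def fidelity(explainer_pred, gnn_pred):
    """Fraction of nodes where the explanation's predicted class matches
    the black-box GNN's predicted class (not the ground-truth label)."""
    explainer_pred = np.asarray(explainer_pred)
    gnn_pred = np.asarray(gnn_pred)
    return float((explainer_pred == gnn_pred).mean())
```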

Takeaway and What's Next
GnnXemplar represents a shift in AI explainability. By combining the geometric strengths of GNNs with the linguistic reasoning of LLMs and the psychological foundations of Exemplar Theory, it provides a roadmap for making complex AI systems truly transparent to the humans who use them.
Stay tuned: we are now working on adapting this approach to more practical, real-world graphs with millions of nodes!
2. Self-Interpretable Graph Models: Transparency by Design
While explaining existing models is powerful, we also pursued a complementary direction focused on intrinsic transparency by asking the following question:
Can we engineer a "glass box" architecture that is inherently transparent yet equally powerful?
This led us to the domain of Self-Interpretable Graph Models. Unlike post-hoc approaches that approximate the reasoning of a frozen model, self-interpretable architectures inherently encode their reasoning process. The "explanation" is not a separate calculation; it is the path the model took to reach its answer.
The distinct advantage here is fidelity. Because the explanation is intrinsic to the computation, there is no gap between what the model did and what the explanation says it did. This guarantees that the reasoning provided is exactly what drove the prediction. Moreover, self-interpretable models often lead to better generalization. By forcing the model to focus on sparse, meaningful sub-structures (like specific chemical motifs in drug discovery or circular trading loops in fraud detection), we encourage it to learn robust, causal patterns rather than noisy correlations. This results in models that are not only transparent but often more robust to distribution shifts in real-world data.
We are proud to highlight our latest contribution to this field, G-NAMRFF (Graph Neural Additive Model with Random Fourier Features), which was presented at the NeurIPS 2025 conference. The project page, poster, and paper are available at: NeurIPS Poster Interpretable and Parameter Efficient Graph Neural Additive Models with Random Fourier Features. The core idea of G-NAMRFF is shown in the architecture diagram in Figure 3.
The "Sum of Parts" Philosophy
Standard Graph Neural Networks are often opaque because they rely on highly entangled operations. When a GNN aggregates data from neighbors, it mixes features in a non-linear "black box" (typically a Multi-Layer Perceptron) where the contribution of any single variable becomes impossible to trace.
G-NAMRFF solves this by adopting a Generalized Additive Model (GAM) approach, inspired by recent successes in Neural Additive Models (NAM) [5]. Instead of a complex, entangled mix, the model forces the prediction to be a simple sum of individual feature contributions.
Imagine we are predicting whether a bank account is compromised. In a standard GNN, the "Account Age" and "Transaction Amount" are multiplied and transformed together, obscuring which one mattered more. In G-NAMRFF, the model learns a specific shape function for "Account Age" and a separate one for "Transaction Amount." The final risk score is simply the sum of these independent scores.
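The "sum of parts" idea in the fraud example above can be sketched in a few lines. The shape functions here are hand-written toys standing in for the learned GP-based functions in G-NAMRFF; the functional forms are purely illustrative.

```python
import numpy as np

# Toy shape functions, one per feature. In G-NAMRFF each of these is a
# learned Gaussian-Process function; here we hard-code plausible shapes.
def shape_account_age(age_months):
    # Hypothetical: newer accounts contribute more risk.
    return np.exp(-age_months / 12.0)

def shape_txn_amount(amount):
    # Hypothetical: risk grows slowly with transaction amount.
    return np.log1p(amount) / 10.0

def risk_score(age_months, amount):
    """Additive prediction: the final score is a plain sum of independent
    per-feature contributions, so each term is directly inspectable."""
    contributions = {
        "account_age": shape_account_age(age_months),
        "txn_amount": shape_txn_amount(amount),
    }
    return sum(contributions.values()), contributions
```

Because the score is a sum, asking "which feature mattered more?" is answered by simply reading off the two contribution values, with no post-hoc attribution needed.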

The Core Innovation: Gaussian Processes with Random Fourier Features
The novelty of G-NAMRFF lies in its principled integration of Bayesian nonparametric modeling with computationally efficient deep learning approximations. Instead of using conventional neural network layers to learn the feature-wise shape functions f_k, which typically require a large number of parameters, G-NAMRFF represents each feature's contribution using a Gaussian Process (GP). This design enables flexible, uncertainty-aware function modeling while preserving interpretability at the feature level.
By treating feature contributions as GPs, we allow the model to learn smooth, complex, and non-linear patterns that fit the data naturally, rather than forcing simple linear relationships. To make this scalable, we employ the Random Fourier Features (RFF) approximation technique [6].
- Implicit Kernel Mapping: RFFs allow us to map input data into a randomized feature space where complex non-linear kernel operations become simple linear dot products.
- Single-Layer Learning: By projecting features into this RFF space, we can learn the complex shape functions using a lightweight, single-layer neural network.
However, efficient feature processing is only half the battle. To truly leverage the power of Graph AI, our architecture must do more than just process node features in isolation; it must inherently respect the connectivity of the graph. The final piece of our architectural puzzle is ensuring that the "glass box" remains graph-aware. We developed a specialized kernel that explicitly incorporates the graph structure via the Laplacian. This ensures that when the model learns the "shape" and impact of a specific feature, it is not analyzing the node in a vacuum. It considers the context of the node's neighbors, integrating local aggregation directly into the interpretable additive framework. The architecture achieves improved parameter efficiency when compared to previous attempts at additive graph models like GNAN [7].
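One simple way to make features graph-aware in this spirit, shown purely as an illustrative sketch (the actual G-NAMRFF kernel integrates the Laplacian in a more principled way), is to stack powers of the symmetrically normalized adjacency applied to the node features, FIR-filter style:

```python
import numpy as np

def normalized_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, the standard
    GCN-style propagation operator derived from the graph Laplacian."""
    A_hat = A + np.eye(len(A))
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_aware_features(A, X, order):
    """Concatenate X, PX, P^2 X, ... up to the given filter order, so each
    feature column carries multi-hop neighborhood context before being
    fed to the per-feature shape functions."""
    P = normalized_adjacency(A)
    blocks, H = [], X.copy()
    for _ in range(order + 1):
        blocks.append(H)
        H = P @ H          # one more hop of aggregation
    return np.concatenate(blocks, axis=1)
```

Lower powers of the operator carry near-neighbor information and higher powers carry farther context, which is the behavior the learned FIR filters exhibit in our ablations.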
Examples from Mutagenicity
In drug discovery, identifying whether a molecule is mutagenic is critical. Standard black-box models might correctly flag a molecule as toxic but fail to reveal which part of the chemical structure is responsible for the overall property [8]. G-NAMRFF goes beyond simple feature importance by providing local structure explanations. By aggregating feature-level attributions along the graph neighborhood, the model highlights the specific subgraph responsible for the prediction. In our experiments, the model automatically extracted and highlighted Nitro (NO2) and Amino (NH2) groups, along with aromatic rings, as the dominant substructures driving mutagenicity. This aligns perfectly with established literature, as the figure below shows.
Results and Discussion
Matching State-of-the-Art Performance: We tested G-NAMRFF across diverse benchmarks, from citation networks like PubMed and Cora to large-scale datasets like ogbn-arxiv. The results were conclusive: our interpretable architecture matches, and in many cases exceeds, the accuracy of leading "black-box" models like Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs). This proves that we can achieve rigorous transparency without compromising on predictive power.
Insights from Filter Orders: Beyond raw accuracy, our ablation studies revealed fascinating behaviors in how the model learns. By analyzing the learned Finite Impulse Response (FIR) filters, we observed that the model automatically learns to prioritize information from closer neighbors while still aggregating necessary context from further away.
Conclusions
In this blog, we addressed two critical dimensions of Trustworthy AI: explaining the decisions of complex black-box models and designing new architectures that are transparent from the ground up. Our in-house explainability framework provides the flexibility to audit and understand diverse GNNs across the organization, ensuring compliance and aiding in model debugging. Complementing this, our self-interpretable architecture, G-NAMRFF, paves the way for the next generation of Graph AI, where high performance and transparency coexist naturally. At our lab, we continue to apply these technologies to complex real-world graph tasks such as fraud detection and credit score assignment.
Connect with us to know more about these technologies. We would love to hear from you!
References
[1] Wang, Xiang, et al. "Reinforced causal explainer for graph neural networks." IEEE Transactions on Pattern Analysis and Machine Intelligence 45.2 (2022): 2297-2309.
[2] Numeroso, Danilo, and Davide Bacciu. "Meg: Generating molecular counterfactual explanations for deep graph networks." 2021 International Joint Conference on Neural Networks (IJCNN). IEEE, 2021.
[3] Wang, Xiaoqi, and Han-Wei Shen. "Gnninterpreter: A probabilistic generative model-level explanation for graph neural networks." arXiv preprint arXiv:2209.07924 (2022).
[4] Azzolin, Steve, et al. "Global explainability of gnns via logic combination of learned concepts." arXiv preprint arXiv:2210.07147 (2022).
[5] Agarwal, Rishabh, et al. "Neural additive models: Interpretable machine learning with neural nets." Advances in neural information processing systems 34 (2021): 4699-4711.
[6] Rahimi, Ali, and Benjamin Recht. "Random features for large-scale kernel machines." Advances in neural information processing systems 20 (2007).
[7] Bechler-Speicher, Maya, Amir Globerson, and Ran Gilad-Bachrach. "The intelligible and effective graph neural additive network." Advances in Neural Information Processing Systems 37 (2024): 90552-90578.
[8] Kazius, Jeroen, Ross McGuire, and Roberta Bursi. "Derivation and validation of toxicophores for mutagenicity prediction." Journal of medicinal chemistry 48.1 (2005): 312-320.