Inter-organizational Multi-Agent Collaboration Technology

Remark: This document was translated using generative AI technology.

Hello, we are Asai, Akima, and Takemori from the Artificial Intelligence Laboratory. In this article, we introduce our work on collaboration technologies for multi-agent systems — a key challenge when applying such systems to real-world settings where agents (often representing different organizations) have diverse and sometimes conflicting objectives.

Multi-AI-Agent Collaboration Across Organizational Boundaries

Research on and practical applications of AI agents, autonomous systems capable of acting independently, are rapidly advancing. Beyond single AI agents designed to tackle complex tasks on their own, there is growing interest in multi-agent systems, where multiple agents with different capabilities work together in a coordinated manner (Tran et al., 2025; Chen et al., 2024). In real-world and enterprise contexts, these AI agents are increasingly expected to collaborate across organizational boundaries, integrating diverse goals and knowledge from different entities.

In this blog post, we introduce two key technologies that enable effective collaboration among AI agents operating across organizations.

Incomplete Information Optimization, which optimizes a multi-AI-agent system even when only the inputs and outputs of some agents are observable
Adaptive Evolution Technology, which allows agents to autonomously evolve their behavioral strategies under uncertainty

Optimization under Incomplete Information

When the environment changes or the data used to train AI agents is updated, it becomes necessary—just as in conventional machine learning operations—to re-optimize or fine-tune multi-AI-agent systems. There are two main challenges in optimizing such systems.

The first challenge is that multi-AI-agent systems inherently operate under incomplete information. An AI agent owned by an organization may be a white-box model, allowing access to its model weights. However, when coordinating multiple AI agents to solve complex tasks, not all agents are necessarily white-box models. For example, some agents may belong to partner organizations, use LLMs accessed through APIs, or even involve human experts and specialists in their decision-making processes. In other words, the internal information of each AI agent is often unknown, and only their inputs and outputs are observable. We refer to the optimization of such multi-AI-agent systems as incomplete information optimization.

Existing methods such as black-box optimization can be applied. However, as illustrated in the figure below, each AI agent has its own parameters $\theta_1, \theta_2, \dots$, so the joint search space grows with every agent added and the optimization becomes exponentially more difficult as the number of agents increases. This constitutes the second major challenge.
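To make the scaling concern concrete, here is one way to write the black-box problem. The symbols $F$, $n$, $d$, and $k$ are our own notation for illustration and do not come from the original work. Suppose there are $n$ agents, agent $i$ has a parameter vector $\theta_i$ with $d$ components, and only the final evaluation value $F$ can be observed:

$$
\max_{\theta_1, \dots, \theta_n} F(\theta_1, \dots, \theta_n)
$$

If each component were discretized into $k$ candidate values, an exhaustive grid would contain

$$
k^{nd}
$$

points. With $k = 10$ and $d = 2$, for example, two agents give $10^{4}$ candidates while five agents give $10^{10}$, which is why treating the whole system as a single black box quickly becomes impractical.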

Figure: Incomplete information optimization in a multi-agent system

Technical Approach and Impact

To address the challenges described above, we developed a technology that enables efficient optimization of multi-AI-agent systems even under incomplete information.

As illustrated in the figure above, our approach allows not only the optimization of the final objective metric (such as user satisfaction or the overall system performance) but also the use of intermediate evaluation values from each AI agent during the optimization process. By leveraging these intermediate metrics, the optimization can be performed more efficiently. This setup is often referred to as function network optimization or gray-box optimization (Astudillo and Frazier, 2021).
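The difference between the black-box and gray-box views can be sketched in a few lines. This is only a structural illustration under our own assumptions: the two toy agents, their quadratic responses, and the names `upstream_agent`, `downstream_agent`, and `Observation` are hypothetical and are not the authors' models.

```python
from dataclasses import dataclass


def upstream_agent(theta1: float) -> float:
    """Hypothetical agent 1: returns an intermediate evaluation value."""
    return 1.0 - (theta1 - 0.3) ** 2


def downstream_agent(theta2: float, upstream_value: float) -> float:
    """Hypothetical agent 2: the final value depends on its own parameter
    and on what agent 1 produced."""
    return upstream_value - (theta2 - 0.7) ** 2


@dataclass
class Observation:
    theta: tuple
    intermediate: float  # visible only in the gray-box setting
    final: float         # visible in both settings


def evaluate(theta1: float, theta2: float) -> Observation:
    """Run the two-agent chain once and record everything observable."""
    y1 = upstream_agent(theta1)
    y2 = downstream_agent(theta2, y1)
    return Observation((theta1, theta2), y1, y2)


def black_box_view(obs: Observation):
    """A black-box optimizer sees only parameters and the final value."""
    return obs.theta, obs.final


def gray_box_view(obs: Observation):
    """A gray-box optimizer additionally sees the intermediate value."""
    return obs.theta, obs.intermediate, obs.final
```

In the gray-box view, each evaluation of the system yields one extra signal per upstream agent, which is the additional information a function-network optimizer can exploit.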

However, a remaining challenge lies in determining which intermediate evaluation values are appropriate for achieving the overall optimization objective. To overcome this, we developed an incomplete information optimization technique that simultaneously optimizes both the intermediate evaluation metrics and the final system objective.

In the experiment shown in the figure below, we applied incomplete information optimization to a supply chain management (SCM) example. This multi-AI-agent system consists of two AI agents (models): one for demand forecasting and another for inventory optimization. In the simulation environment, we compared our proposed method with an existing black-box optimization approach (LogEI).

The horizontal axis represents the number of times the final evaluation value was observed, and the vertical axis shows the final evaluation value (average reward, where higher is better). Even though the system is relatively small—comprising only two AI agents—the results show that our proposed method (blue) achieves optimization more efficiently than the existing approach (orange).
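For readers who want to experiment with a similar setup, the following is a minimal toy version of a two-agent SCM loop. Everything in it is an assumption made for illustration: the demand process, the prices, exponential smoothing as the forecasting agent, a forecast-plus-safety-margin rule as the inventory agent, and plain random search as the optimizer. It is neither the proposed method nor LogEI, and it does not reproduce the experiment above.

```python
import random


def simulate(alpha: float, safety: float, seed: int = 0, periods: int = 200):
    """Toy two-agent SCM simulation (all dynamics are illustrative assumptions).

    Returns (intermediate, final):
      intermediate = negative mean absolute forecast error (higher is better)
      final        = average per-period reward (higher is better)
    """
    rng = random.Random(seed)
    price, cost = 5.0, 3.0
    forecast = 20.0
    abs_err, reward = 0.0, 0.0
    for t in range(periods):
        # Demand alternates between a low and a high level every 25 periods.
        level = 25.0 if (t // 25) % 2 else 15.0
        demand = max(0.0, level + rng.gauss(0.0, 2.0))
        # Agent 2 (inventory): order the forecast plus a safety margin.
        order = max(0.0, forecast + safety)
        sold = min(order, demand)
        reward += price * sold - cost * order
        abs_err += abs(demand - forecast)
        # Agent 1 (forecasting): exponential smoothing with rate alpha.
        forecast = alpha * demand + (1.0 - alpha) * forecast
    return -abs_err / periods, reward / periods


def random_search(n_trials: int = 50, seed: int = 1):
    """Black-box baseline: sample parameters, keep the best final value."""
    rng = random.Random(seed)
    history, best = [], None
    for _ in range(n_trials):
        alpha = rng.uniform(0.0, 1.0)
        safety = rng.uniform(-5.0, 5.0)
        intermediate, final = simulate(alpha, safety)
        history.append((alpha, safety, intermediate, final))
        if best is None or final > best[3]:
            best = history[-1]
    return best, history


if __name__ == "__main__":
    best, _ = random_search()
    print("best (alpha, safety, intermediate, final):", best)
```

Note that the search records the intermediate forecast metric but never uses it; a gray-box method would feed that column into its model instead of discarding it.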

Adaptive Evolution Technology

Technical Overview

When multiple AI agents negotiate or cooperate with others, they must make decisions under incomplete information, where the other parties’ intentions and interests are not fully known. The adaptive evolution technique enables each agent to autonomously evolve its behavioral strategies even in such uncertain environments by estimating the other party’s preferences and strategies.

Specifically, this approach combines a belief space, which updates estimates of the counterpart’s values in a Bayesian manner based on fragmentary dialogue information, with a utility space, which visualizes the satisfaction (utility) of both sides. This combination achieves a balance between efficiency and fairness.
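The Bayesian update behind a belief space can be illustrated with a discrete example. This is a minimal sketch, assuming a small set of hypotheses about the counterpart and hand-picked likelihoods; the hypothesis labels and the numbers are hypothetical, and the actual belief representation used in the work is not described here.

```python
def update_belief(prior: dict, likelihood: dict) -> dict:
    """Bayes' rule over discrete hypotheses.

    prior[h]      = current probability of hypothesis h
    likelihood[h] = probability of the observed utterance if h were true
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation is impossible under every hypothesis")
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical hypotheses about what the counterpart cares about most.
belief = {"values_rebate": 0.5, "values_volume": 0.5}

# Assumed likelihood of hearing a complaint about the rebate
# under each hypothesis (illustrative numbers only).
rebate_complaint = {"values_rebate": 0.8, "values_volume": 0.2}

belief = update_belief(belief, rebate_complaint)
```

Each new fragment of dialogue is folded in by calling `update_belief` again with the current belief as the prior, so the estimate sharpens as evidence accumulates.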

For strategy exploration, we build upon the Monte Carlo Tree Search (MCTS) framework used in the prior study AFlow (Zhang et al., 2025), extending it to iteratively modify and evaluate strategies. Through repeated negotiation experiences, agents can autonomously generate increasingly refined strategies.
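The modify-and-evaluate loop can be sketched as a small tree search. This is not AFlow's algorithm or the authors' implementation: the one-dimensional `strategy` (a concession rate), the `modify` and `evaluate` stand-ins, and the use of UCB1 for selection are all assumptions chosen to keep the example self-contained. In a real system, `evaluate` would run negotiations and `modify` would edit a much richer strategy.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    strategy: float                 # hypothetical concession rate in [0, 1]
    parent: "Node | None" = None
    children: list = field(default_factory=list)
    visits: int = 0
    total_score: float = 0.0


def evaluate(strategy: float, rng: random.Random) -> float:
    """Stand-in for negotiation outcomes: noisy score peaking at 0.6."""
    return 1.0 - (strategy - 0.6) ** 2 + rng.gauss(0.0, 0.01)


def modify(strategy: float, rng: random.Random) -> float:
    """Stand-in for a strategy edit: a small random change, clamped to [0, 1]."""
    return min(1.0, max(0.0, strategy + rng.uniform(-0.2, 0.2)))


def ucb1(child: Node, parent_visits: int, c: float = 0.5) -> float:
    """Mean score plus an exploration bonus for rarely visited children."""
    mean = child.total_score / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def search(iterations: int = 200, seed: int = 0, max_children: int = 3):
    rng = random.Random(seed)
    root = Node(strategy=0.0)
    best_strategy, best_score = root.strategy, -math.inf
    for _ in range(iterations):
        # Selection: descend through fully expanded nodes using UCB1.
        node = root
        while len(node.children) >= max_children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # Expansion: create a modified copy of the selected strategy.
        child = Node(strategy=modify(node.strategy, rng), parent=node)
        node.children.append(child)
        # Evaluation: score the new strategy once.
        score = evaluate(child.strategy, rng)
        if score > best_score:
            best_strategy, best_score = child.strategy, score
        # Backpropagation: update statistics up to the root.
        current = child
        while current is not None:
            current.visits += 1
            current.total_score += score
            current = current.parent
    return best_strategy, best_score
```

Every child is evaluated as soon as it is created, so `ucb1` is never called on an unvisited node. Limiting each node to `max_children` edits forces the search to deepen promising branches instead of only widening the root.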

As a result, the system not only maximizes individual gains but also leads to mutually acceptable and envy-free agreements, enabling multi-AI-agent systems to perform effectively even under uncertainty.
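For reference, the standard fair-division notion of envy-freeness can be checked as follows. This is a simplified sketch under our own assumptions: it treats an agreement as assigning each negotiated item to one party and uses additive valuations, and the party names, items, and numbers are hypothetical. How fairness is measured in the actual system is not specified here.

```python
def is_envy_free(valuations: dict, allocation: dict) -> bool:
    """Return True if no party prefers another party's bundle to its own.

    valuations[party][item] = how much `party` values `item`
    allocation[party]       = set of items assigned to `party`
    """
    def value(party, bundle):
        return sum(valuations[party][item] for item in bundle)

    return all(
        value(a, allocation[a]) >= value(a, allocation[b])
        for a in allocation
        for b in allocation
        if a != b
    )


# Hypothetical additive valuations for two parties over three items.
valuations = {
    "buyer": {"rebate": 6, "promotion": 1, "volume": 3},
    "seller": {"rebate": 2, "promotion": 3, "volume": 5},
}

fair = {"buyer": {"rebate"}, "seller": {"promotion", "volume"}}
unfair = {"buyer": {"promotion"}, "seller": {"rebate", "volume"}}

print(is_envy_free(valuations, fair))    # buyer: 6 vs 4, seller: 8 vs 2
print(is_envy_free(valuations, unfair))  # buyer: 1 vs 9, so the buyer envies
```

The check uses each party's own valuations for both bundles, which is why an agent needs estimates of the counterpart's preferences before it can aim for such an outcome.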

Preliminary Evaluation

In our initial evaluation, we tested the effectiveness of the proposed method in a scenario simulating procurement condition negotiations in the retail industry. The setting included multiple negotiation factors such as rebates, promotional support, and volume commitments.

Agents equipped with the adaptive evolution technique achieved a higher agreement rate and more stable utility than traditional fixed-strategy or rule-based approaches. Furthermore, agents that were retrained on past negotiation histories autonomously adjusted the balance between concession and assertion according to the counterpart’s communication patterns, achieving the highest scores in both overall utility and fairness metrics.

These preliminary results indicate that the adaptive evolution technique enables efficient and fair agreement formation even in incomplete-information environments.

For readers who want to learn more

At the Artificial Intelligence Laboratory, we are actively presenting and publishing our research on multi-agent collaboration technologies across organizations under incomplete information, along with their practical applications, through various channels.

If you’re interested, please check out the links below for more details.