Who Pays the Bill When One Agent Calls Another?

10 Jun 2026



Without AI Attribution, the Whole Enterprise Does

A customer submits a support request, and your orchestration layer routes it to an agent. That agent queries a knowledge base, then hands the task to a second agent for sentiment analysis, which calls a third for account history, which triggers a fourth for churn prediction, which escalates to a fifth for retention offer generation.

Five agents, five model calls, five cost events, one customer interaction. When the invoice arrives at the end of the month, you see a single line item for API usage. There is no record of which agent spent what, whose budget absorbed the cost, or whether the outcome justified the total price.

This is the financial reality of multi-agent architectures: Costs accumulate at every handoff between agents, and no existing tool is capable of managing them. As organizations rapidly adopt agentic AI — KPMG reports that agent deployment more than doubled in 2025, from 11% to 26% — this out-of-control spend threatens to become a liability.

This article breaks down how unattributable spending compounds in multi-agent chains, why your current tooling cannot close the gap, and how an attribution framework would work.

One Request, Five Agents, Zero Attribution

The shift from single-agent to multi-agent systems introduces a new category of financial exposure. A single agent calling one model has a traceable cost path: one request, one inference, one bill. When that agent delegates subtasks to other agents, the cost path fractures. Each downstream agent may call different models, use different APIs, and draw from different budget pools, but your billing system still records the whole chain as a single charge.

Attribution fractures alongside cost. Say a routine support ticket triggers a multi-agent chain, starting with an agent that belongs to the customer success team. It kicks off a sentiment analysis agent that sits under product, then a churn prediction model owned by data science.

Finance sees the aggregate and asks who authorized it. Engineering points to the orchestration layer. The orchestration layer points back to the triggering event, that routine support ticket. Blame is assigned, but the organization is no closer to reducing the overall spend.

Every Handoff Multiplies the Bill

Agent-to-agent handoffs compound costs in ways that traditional technology budgets and tooling are not equipped to handle. Each agent in a chain typically repeats context-gathering steps: re-reading conversation history, re-embedding documents, re-querying APIs that the previous agent already called.

These redundant steps are invisible in aggregate billing but dominate actual token consumption. Errors and unexpected results along the chain only heighten the risk of overspend: When one agent in a chain encounters an ambiguous result, it retries. If the retry produces a different output, the downstream agent retries its own interpretation. A five-agent chain with a single retry at each step does not produce five extra calls. It produces a combinatorial expansion that can generate dozens of redundant inferences before the chain resolves.
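The arithmetic behind that expansion can be sketched under one deliberately pessimistic assumption: every agent makes two attempts (the original call plus one retry), and each attempt produces a different output that forces everything downstream to run again. The function name and the model itself are illustrative, not a measurement of any real system.

```python
def total_calls(chain_length: int, attempts_per_agent: int) -> int:
    """Count inferences in a linear agent chain.

    Assumed worst case: each attempt by an agent re-triggers the
    entire downstream chain, so calls multiply at every handoff.
    """
    total = 0
    calls_at_this_step = 1
    for _ in range(chain_length):
        calls_at_this_step *= attempts_per_agent
        total += calls_at_this_step
    return total


baseline = total_calls(chain_length=5, attempts_per_agent=1)
with_retries = total_calls(chain_length=5, attempts_per_agent=2)

# Five agents with no retries make 5 calls; with one retry each
# the worst case is 2 + 4 + 8 + 16 + 32 = 62 calls.
print(baseline, with_retries, with_retries - baseline)  # 5 62 57
```

Real chains rarely hit the full worst case, because not every retry changes the output. The point of the sketch is the shape of the curve: the extra calls grow with the product of retries along the chain, not the sum.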

Configuration errors raise the stakes further. A single misconfigured threshold, such as the number of retries an agent attempts before escalating, can burn through an entire monthly budget in hours without firing an alert, since the agent is executing flawlessly within its parameters.

Ultimately, poorly governed agentic workflows may cost more than they return. Sinan Ozdemir, CTO of UseCrucible.ai, observed that for every successful agent deployment, dozens of other deployments cost roughly three times more than the workflows they replaced while delivering comparable or worse results. Research from Information Matters shows that agentic systems consume five to nine times more tokens per workflow than standard generative AI. It cites a cautionary tale from an ecommerce firm that implemented an agentic supply chain optimizer, only to see costs rise from $5,000 to $50,000 per month due to agents “entering recursive loops during high-volume periods.”

Your Stack Can't Trace the Chain

The challenge with governing agentic spend is that existing infrastructure categories were built for deterministic systems, not for autonomous systems that behave in novel ways.

Observability measures performance: uptime, latency, error rates. It has no concept of whether a successful call was economically justified. Your observability platform confirms that each agent completed its calls with a 99% success rate and sub-100ms latency.

FinOps reports historical spend at the account or service level. It cannot authorize or block a specific call in real time based on its business value. Your FinOps tool shows $50,000 in API spend last month.

API gateways enforce access and rate limits. A $1,000 call and a $1 call that produce equivalent output are equally "authorized." Your API gateway confirms every request was authenticated and within rate limits.

Each tool answers a question adjacent to cost attribution, but none follows a dollar from a triggering request through five agent handoffs to a final business outcome. That means you can’t answer the question that actually matters: Should those calls have happened, and who is responsible for the cost?

Give Every Action an Owner

Closing this gap requires two structural changes that operate together: binding every agent action to an accountable owner, and capturing every action as a standard unit of work that can be traced through any chain.

Identity binding means that every agent action carries four pieces of context before it can execute: a human budget owner responsible for the spend, a specific agent instance that performed the action, a business workflow the action belongs to, and a customer context that justifies it.

An agent handling a churn-risk escalation for a $50,000 LTV enterprise account traces to the VP of Customer Success, the retention workflow, and that specific account. An agent answering a routine billing question traces to the operations budget and a different account tier.

Both paths are attributable before the cost is incurred. If any of the four identity components is missing, the agent does not get access to paid resources. The principle is simple: no identity, no execution.
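A minimal sketch of that gate, assuming the four components travel with each action as plain strings. The names `ActionIdentity`, `MissingIdentityError`, and `execute_paid_action`, and the sample identifiers, are hypothetical and chosen for illustration. They do not describe any particular product's API.

```python
from dataclasses import dataclass, fields
from typing import Callable, Optional


@dataclass(frozen=True)
class ActionIdentity:
    """The four pieces of context an action must carry (hypothetical schema)."""
    budget_owner: Optional[str]      # human responsible for the spend
    agent_instance: Optional[str]    # specific agent performing the action
    workflow: Optional[str]          # business workflow the action belongs to
    customer_context: Optional[str]  # customer or account that justifies it


class MissingIdentityError(PermissionError):
    """Raised when an action arrives without a complete identity."""


def execute_paid_action(identity: ActionIdentity, action: Callable[[], str]) -> str:
    """Run `action` only if all four identity components are present."""
    missing = [f.name for f in fields(identity) if not getattr(identity, f.name)]
    if missing:
        # No identity, no execution: refuse before any cost is incurred.
        raise MissingIdentityError(f"refused: missing {', '.join(missing)}")
    return action()


bound = ActionIdentity("vp-customer-success", "retention-agent-01",
                       "retention", "enterprise-account-123")
unbound = ActionIdentity(None, "billing-agent-07",
                         "billing-question", "account-456")

print(execute_paid_action(bound, lambda: "offer generated"))
try:
    execute_paid_action(unbound, lambda: "never runs")
except MissingIdentityError as err:
    print(err)  # refused: missing budget_owner
```

The check runs before the paid call rather than after it, which is what separates this from reporting: the unbound action never reaches a model, so there is no unattributed cost to explain later.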

A standard unit of work bundles each action into a single auditable record: what the agent did, what it cost, what business context it operated in, what outcome it produced, and who owns the bill.

Every enterprise domain already has this kind of atomic unit. Finance has the dollar in the ERP. HR has the employee in the HRIS. Sales has the deal in the CRM. Autonomous AI agents have no equivalent. They are invisible to the systems of record that run the rest of your business, so costs cascade through multi-agent chains without landing in any ledger.

A standard work record for each agent action changes the equation. Each step in a five-agent chain becomes individually accountable: attributable to an owner, connected to an outcome, and auditable after the fact.
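One way to picture such a record is sketched below, using the support-ticket chain from earlier. The `WorkRecord` fields mirror the five items listed above plus a trace ID that ties the steps together. The field names, the `cost_by_owner` helper, and every dollar figure are assumptions made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class WorkRecord:
    """One auditable unit of agent work (hypothetical schema)."""
    trace_id: str       # ties every step back to the triggering request
    agent: str          # which agent acted
    action: str         # what the agent did
    cost_usd: Decimal   # what it cost
    workflow: str       # business context it operated in
    outcome: str        # what it produced
    budget_owner: str   # who owns the bill


def cost_by_owner(records: list[WorkRecord], trace_id: str) -> dict[str, Decimal]:
    """Total the cost of one request's chain, grouped by budget owner."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for record in records:
        if record.trace_id == trace_id:
            totals[record.budget_owner] += record.cost_usd
    return dict(totals)


# Illustrative figures only; none of these costs come from real billing data.
ledger = [
    WorkRecord("ticket-1", "support-agent", "knowledge base lookup",
               Decimal("0.02"), "support", "answer drafted", "customer-success"),
    WorkRecord("ticket-1", "sentiment-agent", "sentiment analysis",
               Decimal("0.01"), "support", "negative sentiment", "product"),
    WorkRecord("ticket-1", "churn-agent", "churn prediction",
               Decimal("0.05"), "support", "high churn risk", "data-science"),
    WorkRecord("ticket-1", "support-agent", "final reply",
               Decimal("0.03"), "support", "ticket resolved", "customer-success"),
]

for owner, total in cost_by_owner(ledger, "ticket-1").items():
    print(owner, total)
```

With records in this shape, the single invoice line for one ticket splits into a subtotal per owner, and each subtotal can be read next to the outcome it paid for.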

Together, these two components make it possible to trace any dollar through any chain of handoffs and answer three questions: Who authorized this, what did it cost, and was it worth it?

Every New Agent Widens the Gap

Multi-agent architectures are scaling faster than the controls around them, and competitive pressure is accelerating the pace. But every agent you add to a production workflow increases the risk of monthly invoices no one can explain, budget conversations no one can resolve, and ROI questions no one can answer. The math compounds in one direction: more agents, more handoffs, more spend, less traceability.

The organizations that build the attribution layer now will scale multi-agent systems without budget overruns. Those that wait will discover the gap only when the bill arrives orders of magnitude higher than expected, with no record of what went wrong.

Revenium is the AI Economic Control System that traces every AI transaction to its business outcome and enforces economic boundaries before unattributed spend hits your P&L. Check out our report, The Financial Blind Spots in Autonomous AI, to see the complete framework.
