Your Best Customers Are Your Biggest Loss Leaders (And You Don't Even Know It)

15 Jan 2026
John Rowell
CEO

TL;DR: 77% of companies are paying more for AI despite falling tech costs. If you're running flat-rate pricing, your power users are driving 60%+ of your AI costs while paying the same as light users. Traditional SaaS pricing assumes predictable costs. AI breaks that model. The atomic unit isn't tokens or API calls, it's the situated job. Without runtime authorization to govern AI work before it happens, you can't stop a $200K customer from costing you $600K to serve. Dashboards show what happened. Governance prevents what's about to happen.

Your enterprise customer just renewed for another year. ARR is up. The board is happy. Your CFO doesn't know you're losing money on every request they make.

Here's the number that should terrify you: 77% of companies are paying more for AI, even as technology costs fall. [1]

That number comes from Google Cloud's 2025 ROI of AI report, which surveyed 3,466 executives across industries. It tells us that usage is exploding faster than prices are dropping. Companies are consuming AI at rates that obliterate any savings from cheaper models.

And if you're running a flat-rate pricing model, you're subsidizing your power users with everyone else's revenue.

The Hidden Subsidy Nobody Talks About

Let's lay out what's really happening behind your P&L.

Customer A pays $50K/year for your platform. They use $180K in AI compute.

Customer B pays $50K/year for the same platform. They use $8K in AI compute.

You're charging them the same.

This isn't a rounding error. SaaS companies have priced software the same way for twenty years. The model assumes predictable, bounded costs. AI breaks that assumption completely.

10% of your users are driving 60%+ of your AI costs. [1] The margin you lose on those accounts is enough to swallow what the rest of your customer base pays toward your entire infrastructure.
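To make the arithmetic concrete, here's a back-of-the-envelope sketch. It reuses the hypothetical Customer A and Customer B figures above, then assumes a made-up 100-customer book of business with $2M in total AI spend, where 10% of accounts drive 60% of that spend:

```python
# Illustrative arithmetic only: the customer figures are the hypothetical
# examples from the text, and the $2M book of business is an assumption.

price = 50_000  # flat annual price per customer

customers = {
    "Customer A (power user)": 180_000,  # annual AI compute consumed
    "Customer B (light user)": 8_000,
}

for name, ai_cost in customers.items():
    margin = price - ai_cost
    print(f"{name}: revenue ${price:,}, AI cost ${ai_cost:,}, margin ${margin:,}")

# Scale the same skew across 100 customers:
# 10 accounts consume 60% of an assumed $2M annual AI spend.
n_customers = 100
total_ai_spend = 2_000_000
heavy_users, heavy_share = 10, 0.60

heavy_cost_each = total_ai_spend * heavy_share / heavy_users
light_cost_each = total_ai_spend * (1 - heavy_share) / (n_customers - heavy_users)

print(f"Margin per heavy account: ${price - heavy_cost_each:,.0f}")  # deeply negative
print(f"Margin per light account: ${price - light_cost_each:,.0f}")  # the subsidy source
```

Under those assumptions each heavy account loses roughly $70K a year, and the light accounts' margin is what quietly covers it.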

Where do these costs hide? Everywhere finance teams aren't looking:

  • Inference costs buried in "cloud infrastructure overhead"
  • Vector database queries categorized as "storage and retrieval"
  • LLM inference calls disguised as "platform services"
  • Context window expansion that scales non-linearly with usage

Finance sees a clean AWS bill. Engineering sees inference requests. Nobody sees the atomic unit of cost: the situated job—the specific work a specific user is trying to do, with specific economic consequences. [2]

Why Flat-Rate Pricing Worked (And Why It's Breaking)

Flat-rate SaaS worked for CPU and storage because costs were predictable and variance was low.

A user generating 100 reports costs roughly the same as a user generating 1,000 reports. Storage scales linearly. Compute is cheap and bounded. The heaviest user might consume 3–5x what the lightest user does. You price for the median, absorb the edges, and call it product-market fit.

AI is different. A single customer query can trigger $50 in compute costs, or $0.02. You can't predict it because:

  • The complexity of the query varies wildly
  • The agent's reasoning path is non-deterministic
  • Multi-step workflows spawn sub-agents that recurse
  • Context windows balloon as conversations extend
  • Tool calls cascade across external APIs with their own pricing

The math that breaks everything. Flat pricing + unpredictable AI consumption = subsidizing power users with light users' revenue.
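Here's a quick illustration of why "price for the median" collapses. The distribution below is an assumption, a heavy-tailed lognormal chosen only to mimic a spread from a few cents to tens of dollars per job, not measured data:

```python
# Illustrative only: the lognormal parameters are assumptions chosen to
# produce a wide spread in per-job cost, not real usage data.
import random
import statistics

random.seed(7)

# Heavy-tailed per-job cost: most jobs cost cents, a few cost tens of dollars.
costs = [random.lognormvariate(mu=-2.0, sigma=1.8) for _ in range(100_000)]

median_cost = statistics.median(costs)
mean_cost = statistics.fmean(costs)

print(f"median cost per job: ${median_cost:.2f}")  # what flat pricing is tuned to
print(f"mean cost per job:   ${mean_cost:.2f}")    # what you actually pay
print(f"share of spend from the top 1% of jobs: "
      f"{sum(sorted(costs)[-1000:]) / sum(costs):.0%}")
```

When the tail is heavy, the mean sits far above the median, so a price set against typical usage is underwater against actual spend.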

And it compounds. As your product improves and your AI becomes smarter and more autonomous, your costs increase rather than decrease.

Your Best Customers Are Destroying Your Margins (And You Can't Stop Them)

Your power users love unlimited AI. They're your champions. They write case studies. They expand into other departments.

They're also running jobs you can't afford.

The real problem is that you can't govern what you can't see as work.

Finance teams try to track "tokens" or "API calls," but those aren't the atomic unit of cost. A token tells you nothing about business value. A call tells you nothing about who authorized the spend or why.

The atomic unit is the Situated Job: the specific work a specific user is trying to do, in a specific context, with specific economic consequences. [2]
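If you wanted to make the situated job a first-class object in your own system, a minimal record might look like the sketch below. The field names are hypothetical, not a reference to any particular product's schema:

```python
# Hypothetical schema: field names are illustrative, not a standard.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class SituatedJob:
    job_type: str              # e.g. "summarize_contract", "generate_report"
    user_id: str               # the specific user doing the work
    tenant_id: str             # the paying customer account
    trigger: str               # what initiated it: UI action, agent, schedule
    estimated_cost_usd: float  # pre-execution estimate, not an after-the-fact bill
    max_cost_usd: float        # the economic threshold this job may not exceed
    authorized_by: str | None = None  # who (or what policy) approved the spend
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

job = SituatedJob(
    job_type="summarize_contract",
    user_id="u_481",
    tenant_id="acme-corp",
    trigger="ui:summarize_button",
    estimated_cost_usd=14.20,
    max_cost_usd=25.00,
)
```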

Until you can see, measure, and price at the job level, you don't have a visibility problem, you have an authority problem.

Who authorized this spend? Who decided this job was worth running? When a customer triggers a $200 inference job to generate a summary, who has the authority to say "not at that price"?

Right now, the answer is nobody.

Why Visibility Isn't Control

Your CFO can see the AWS bill. Your engineers can see the Datadog dashboard. Your product team can see the feature usage metrics.

Nobody can stop the bleeding.

Dashboards show you what happened. They don't prevent what's about to happen.

This is the gap that's quietly bankrupting AI business models. It’s the difference between observability and governance.

Observability tells you an agent just spent $1,200 running a recursive loop for 11 days. [3] Governance would have stopped it at $100.

The missing primitive in AI economics is runtime authorization. It’s the ability to govern work before it consumes resources, not after. [4]
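What might that primitive look like in practice? Here's a minimal sketch, reusing the hypothetical SituatedJob record from earlier; the policy values and function names are assumptions, not an existing API:

```python
# Hypothetical pre-execution gate; thresholds and names are illustrative.
# (Assumes the SituatedJob dataclass from the earlier sketch is in scope.)

class BudgetExceeded(Exception):
    pass

# Per-tenant remaining budget and per-job-type ceilings (illustrative policy data).
TENANT_BUDGET_USD = {"acme-corp": 900.00}
JOB_COST_CEILING_USD = {"summarize_contract": 25.00, "deep_research": 5.00}

def authorize(job: SituatedJob) -> None:
    """Decide whether a job may run, before it consumes any compute."""
    ceiling = JOB_COST_CEILING_USD.get(job.job_type, 1.00)
    remaining = TENANT_BUDGET_USD.get(job.tenant_id, 0.0)

    if job.estimated_cost_usd > ceiling:
        raise BudgetExceeded(
            f"{job.job_type} estimated at ${job.estimated_cost_usd:.2f}, "
            f"ceiling is ${ceiling:.2f}: requires explicit approval"
        )
    if job.estimated_cost_usd > remaining:
        raise BudgetExceeded(
            f"tenant {job.tenant_id} has ${remaining:.2f} left this period"
        )

    job.authorized_by = "policy:default"  # record who/what approved the spend

def run_with_governance(job: SituatedJob, execute):
    authorize(job)       # the gate: this runs before any resources are consumed
    return execute(job)  # only reached if the spend was approved
```

The detail that matters is where the check sits: before the work executes, not in a dashboard after the invoice arrives.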

This is economic governance, not FinOps.

And the need is urgent. Google's research shows that 52% of enterprises have deployed agentic AI, and 39% are managing 10 or more agents. [1] Agents don't sleep. They don't take breaks. They recurse. They spawn subtasks. They query, synthesize, and act autonomously.

A single agent can trigger dozens of model calls, each with variable cost, in seconds.

Average AI spend is now 26% of IT budgets, roughly $250,000 per 1,000 employees. [1] But that average hides a brutal distribution. Some customers are at $100K. Others are at $500K. And under flat-rate pricing, you're treating them the same.

Can You Answer These 4 Questions Right Now?

If you're scaling AI features, you need to answer these before your next board meeting:

1. Which customer jobs are profitable and which are subsidized?

Not "which customers use AI the most." Which specific jobs—summarization, analysis, generation, search—are margin-positive versus margin-negative? If you can't break it down to the job level, you can't optimize it.

2. Can you set an economic threshold and enforce it at runtime?

Not a cost alert that fires after the damage is done. A pre-execution gate that says "this job will cost $X, do you want to proceed?" Can your system ask that question? Can it enforce the answer?
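Concretely, with the hypothetical gate sketched earlier, question 2 becomes a code path instead of a policy document:

```python
# Hypothetical usage of the gate above; names and figures are illustrative.
expensive_job = SituatedJob(
    job_type="deep_research",
    user_id="u_902",
    tenant_id="acme-corp",
    trigger="agent:weekly_digest",
    estimated_cost_usd=200.00,
    max_cost_usd=5.00,
)

try:
    run_with_governance(expensive_job, execute=lambda job: "model output")
except BudgetExceeded as reason:
    # Surface the decision to whoever holds the authority, before any spend.
    print(f"Blocked pre-execution: {reason}")
```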

3. When a user triggers a $200 inference job, who has authority to stop it?

Is it the user? The account owner? The platform? Is the authority embedded in your system, or is it a Slack message sent after the invoice arrives?

4. If you can't answer these, how do you scale AI profitably?

Because your competitors are figuring this out. The companies that instrument AI economics now, the ones that can measure cost-per-job, enforce thresholds, and prove ROI at the feature level, will have a defensible moat. The ones that don't will have "AI initiative" budget line items they can't explain.

The Bottom Line

Your margins are bleeding out in real time.

Most FinOps teams can already calculate cost-per-customer. Export a CSV, run a pivot table, and you have attribution. That's reporting.

What matters is whether your system can authorize work in real time. Whether it has the economic intelligence to govern AI jobs before they consume resources, based on who's running them and what they'll cost.

Because the alternative is discovering, six months from now, that your best customer, the one paying $200K/year, just cost you $600K to serve.

And by then, it's too late.

Ship With Confidence

Real-time AI cost metrics in your CI/CD and dashboards

Catch issues before deploy, stay on budget, and never get blindsided by after-the-fact spreadsheets.