The Inference Payback
What Has to Be True for AI Factories to Fund Themselves
Our last several reports worked through the customer side of AI monetization. The AI ROI report looked at where AI creates measurable workflow value. Platform Wars asked which layer captures the economics once those workflows move into production. The E/AI Index moved into budget architecture, showing how CIOs and CTOs are funding AI through new budgets, reallocation, and deployment maturity.
This report moves to the infrastructure side of the same cycle. If customers are finding ROI, platforms are competing to own the workflow layer, and CIOs are forming AI budgets, the next question is whether the factories underneath that demand can earn their cost of capital. The buildout now has to show that consumption can carry compute, power, leases, depreciation, financing, and refresh cycles.
We remain constructive on AI demand. Models are improving, user bases are expanding, enterprise experimentation is deeper than it was a year ago, and the largest platforms continue to commit capital at a scale that only makes sense if they believe usage keeps compounding. Demand growth, on its own, is no longer the hard part of the argument. The harder part is understanding what kind of demand it is, who pays for it, and whether it produces enough gross profit per MW to carry the infrastructure behind it.
That is why we keep coming back to profitable demand density per MW. Training capital was easier to justify when the main objective was capability: better models, scarce compute access, and position on the capability curve. Inference has a different burden. It is where products expose demand, customers test pricing, and fixed infrastructure cost has to be absorbed by workloads that either generate revenue directly or create enough strategic value to justify the capacity they consume.
Token growth can come from very different places, and those differences are where the economics get messy. A paid API call tied to an enterprise workflow is not the same as a free consumer query, a discounted batch job, an internal eval run, or a long chain of hidden reasoning tokens the user never sees. Each consumes compute. Only some produce revenue with enough pricing power and margin to help pay for the factory behind it. That is the mix problem: token volume can look healthy while realized pricing, serving cost, utilization quality, or revenue traceability all move in the wrong direction.
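The mix problem can be sketched numerically. In the toy calculation below, every segment volume and price is invented for illustration; the point is only the shape of the math, not any operator's actual mix:

```python
# Toy illustration of the token-mix problem: total token volume grows
# while the blended realized price per token falls. All figures invented.

def blended_price(segments):
    """Revenue-weighted realized price per token across workload segments."""
    tokens = sum(s["tokens"] for s in segments)
    revenue = sum(s["tokens"] * s["price"] for s in segments)
    return revenue / tokens

# Period 1: mostly paid API traffic (tokens in billions, price in $ per 1M tokens)
p1 = [
    {"name": "paid_api",  "tokens": 60, "price": 10.0},
    {"name": "free_tier", "tokens": 30, "price": 0.0},
    {"name": "batch",     "tokens": 10, "price": 3.0},
]

# Period 2: total tokens double, but the growth comes from free, discounted,
# and hidden reasoning traffic rather than paid enterprise workflows
p2 = [
    {"name": "paid_api",  "tokens": 70, "price": 9.0},
    {"name": "free_tier", "tokens": 80, "price": 0.0},
    {"name": "batch",     "tokens": 30, "price": 3.0},
    {"name": "reasoning", "tokens": 20, "price": 0.0},  # internal chain-of-thought tokens
]

for label, mix in [("period 1", p1), ("period 2", p2)]:
    vol = sum(s["tokens"] for s in mix)
    print(f"{label}: {vol}B tokens, blended ${blended_price(mix):.2f} per 1M tokens")
```

At these invented inputs, token volume doubles from 100B to 200B while the blended realized price falls from $6.30 to $3.60 per million tokens, which is exactly the divergence the mix problem describes.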
Enterprise spend is an important part of the answer, and it should be one of the cleaner early proof points. The CIO budget work gives us more confidence that real enterprise AI budgets are forming. We would still be careful about treating those budgets as enough to fund the full hyperscaler AI buildout on their own. The infrastructure case needs several things to work at the same time: paid enterprise workflows, consumer monetization, internal product lift, platform services, high utilization, lower cost per token, sustained premium demand, and disciplined capital deployment. Enterprise AI TAM is too narrow a denominator for that broader payback question. The better unit of analysis is profitable AI revenue and strategic value per MW.
For our diligence work, this moves the tracking burden lower in the stack. Capex, product adoption, and revenue growth are useful, but they are still too high level to answer the payback question. The harder work is estimating realized pricing, workload mix, utilization, depreciation pressure, lease exposure, and the rate at which hardware and software improvements reduce cost per token. A high-usage world can still produce weak payback if too much of the demand is free, discounted, internal, or expensive to serve.
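The payback lens described above can be reduced to a rough sketch. Every input below is an assumption chosen purely for illustration, not an estimate of any operator's economics:

```python
# Rough sketch of a per-MW payback lens. Every input is an illustrative
# assumption, not an estimate of any real operator's economics.

def gp_per_mw(tokens_per_mw_year, realized_price, cost_to_serve):
    """Gross profit per MW-year: tokens served times (price minus serving cost)."""
    return tokens_per_mw_year * (realized_price - cost_to_serve)

def simple_payback_years(capex_per_mw, gp_per_mw_year, opex_per_mw_year):
    """Years of gross profit, net of fixed opex, needed to recover capex."""
    net = gp_per_mw_year - opex_per_mw_year
    if net <= 0:
        return float("inf")  # demand never carries the factory
    return capex_per_mw / net

# Assumed inputs (per MW of IT capacity)
capex = 35e6     # $ per MW: accelerators, shell, networking
opex = 3e6       # $ per MW-year: power, leases, staff
tokens = 2.0e12  # tokens served per MW-year at assumed utilization
price = 5.0e-6   # $ realized per token (blended across the mix)
cost = 1.5e-6    # $ serving cost per token

gp = gp_per_mw(tokens, price, cost)
years = simple_payback_years(capex, gp, opex)
print(f"GP/MW-year: ${gp/1e6:.1f}M, simple payback: {years:.2f} years")
```

At these invented inputs, gross profit comfortably covers opex, yet simple payback lands near nine years, longer than a typical accelerator refresh cycle. That is the sense in which a high-usage world can still produce weak payback: the same arithmetic with a worse mix (lower realized price, higher serving cost, or lower utilization) pushes the payback horizon out past the useful life of the hardware.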
Our subscriber report builds the model behind that view. It separates token growth from profitable inference demand, uses revenue per MW and gross profit per MW as the operating lens, and tests what has to be true across hyperscalers, frontier labs, neoclouds, and private AI factories. The goal is to identify which operators have enough control over power, silicon, pricing, distribution, utilization, and balance sheet risk to make the math work.
In the full report, paid subscribers get:
A payback model for separating token growth from profitable inference demand.
The token demand funnel: total demand, monetizable demand, and profitable demand.
A revenue-per-MW lens for comparing hyperscalers, frontier labs, neoclouds, and private AI factories.
Scenario work showing what has to be true for inference to fund the infrastructure buildout.
A treatment of why enterprise AI budgets are only one part of the payback stack.
A breakdown of how internal inference can create strategic value while weakening clean revenue-per-MW math.
A view on which operators have structural advantages through power, silicon, distribution, pricing, and balance sheet capacity.
A monitorable set of signals for pricing, utilization, capex/MW, GP/MW, depreciation, lease exposure, refinancing risk, and the next 12 to 24 months of evidence.