The Diligence Stack - By Creative Strategies

AI Server Demand Is Becoming Three Markets

Ben Bajarin
Jun 16, 2026

Companion note. This report is the supply-and-capacity half of a two-part research program. It sizes how much AI server hardware gets built and who owns it. The companion note, “Confidential AI: The Permission Layer for Enterprise and Sovereign Infrastructure” (June 2026), takes the demand-conversion half, namely whether the regulated and sovereign share of that capacity is allowed into production and what that does to revenue quality. The two notes meet in the enterprise and sovereign segment, where capacity sizing and permission economics turn out to be the same question asked from opposite ends.

Most AI server numbers in circulation still try to measure one market. We have increasing conviction that this is no longer the right way to analyze what is happening. The buildout is separating into three markets, and the line between them is better explained by ownership than by location.

Ownership is the cleaner starting point.
Cloud versus on-premises, the framing inherited from the last server cycle, has become the least useful question you can ask about an AI rack. What decides the economics now is who owns or finances the hardware, who consumes the compute, and where the capacity sits in the value chain. Those questions increasingly return three different answers for the same rack.

This is where the double counting starts.
Walk a single cluster through the value chain and you can watch it happen in real time. A neocloud raises debt against a long-term contract, buys GPU systems from a contract manufacturer, and installs them in a leased, powered building. A hyperscaler contracts that capacity for several years. A model lab consumes the compute through the hyperscaler. The manufacturer books a cloud AI server sale, the neocloud reports the asset and its backlog, the hyperscaler reports the lease commitment, and the model lab announces a multi-gigawatt pipeline.

One cluster turns into four reporting streams. Add them together, which is what a lot of market sizing quietly does, and you get a total addressable market that does not physically exist.
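The arithmetic behind that walk-through can be sketched in a few lines. This is a minimal illustration, not the report's model: the $10B hardware value, the `ReportingStream` type, and the simplifying assumption that every disclosure implies the same hardware value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportingStream:
    reporter: str           # party that discloses the number
    reported_as: str        # how the cluster shows up in its reporting
    hardware_usd_bn: float  # hardware value implied by the disclosure (hypothetical)
    owns_hardware: bool     # True only for the party that holds the asset


# One hypothetical cluster with $10B of server hardware, seen four ways.
# Simplifying assumption: every disclosure implies the same hardware value.
CLUSTER_STREAMS = [
    ReportingStream("contract manufacturer", "cloud AI server sale", 10.0, False),
    ReportingStream("neocloud", "asset and backlog", 10.0, True),
    ReportingStream("hyperscaler", "lease commitment", 10.0, False),
    ReportingStream("model lab", "multi-gigawatt pipeline", 10.0, False),
]


def naive_total(streams):
    """Add every disclosure together, which counts the cluster once per reporter."""
    return sum(s.hardware_usd_bn for s in streams)


def owner_based_total(streams):
    """Count the hardware once, on the books of the party that owns it."""
    return sum(s.hardware_usd_bn for s in streams if s.owns_hardware)


print(naive_total(CLUSTER_STREAMS))        # 40.0
print(owner_based_total(CLUSTER_STREAMS))  # 10.0
```

Under these assumptions the naive sum is four times the hardware that physically exists, which is the gap an owner-based count is meant to close.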

The three ownership buckets.
Modeling by owner fixes this, because every dollar of server hardware has exactly one owner even when four parties touch it. That gives us three mutually exclusive buckets:

  • Hyperscaler-owned hardware: capacity the largest cloud platforms build and run for themselves.

  • Neocloud and third-party AI factories: GPU clouds and hosted-compute operators that own infrastructure and rent it out.

  • Enterprise and private AI factories: systems that companies, governments, and regulated industries buy and run inside their own walls.

Consumption still earns its place, just in a separate view that tells us how hard the capacity is working and how durable the demand is. It should never be folded back into the size of the market.
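One way to picture the separation between the ownership view and the consumption view is the sketch below. It assumes a hypothetical `Deployment` record with exactly one owner and one consumer; the deployment names and dollar figures are invented for illustration and are not forecast data.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Owner(Enum):
    HYPERSCALER = "hyperscaler-owned"
    NEOCLOUD = "neocloud / third-party AI factory"
    ENTERPRISE = "enterprise / private AI factory"


@dataclass(frozen=True)
class Deployment:
    name: str
    owner: Owner            # exactly one owner per dollar of hardware
    consumer: str           # who actually uses the compute
    hardware_usd_bn: float  # hypothetical hardware value


# Hypothetical deployments, for illustration only.
DEPLOYMENTS = [
    Deployment("campus-1", Owner.HYPERSCALER, "hyperscaler internal", 50.0),
    Deployment("cluster-A", Owner.NEOCLOUD, "model lab via hyperscaler", 10.0),
    Deployment("cluster-B", Owner.NEOCLOUD, "enterprise tenants", 5.0),
    Deployment("bank-dc", Owner.ENTERPRISE, "regulated enterprise", 3.0),
]


def size_by_owner(deployments):
    """Market size: each deployment lands in exactly one ownership bucket."""
    totals = {owner: 0.0 for owner in Owner}
    for d in deployments:
        totals[d.owner] += d.hardware_usd_bn
    return totals


def consumption_view(deployments):
    """Separate lens on who uses the capacity; never added to market size."""
    view = defaultdict(float)
    for d in deployments:
        view[d.consumer] += d.hardware_usd_bn
    return dict(view)


market = size_by_owner(DEPLOYMENTS)
usage = consumption_view(DEPLOYMENTS)

# The buckets are mutually exclusive, so they sum to the physical total.
assert sum(market.values()) == sum(d.hardware_usd_bn for d in DEPLOYMENTS)
```

The consumption view redistributes the same dollars by user, so it can describe utilization and demand durability without changing the total.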

Power capacity needs the same discipline.
Gigawatts are not generic demand signals either. A contracted megawatt carries different deployment risk depending on whether it is tied to a hyperscaler campus, a neocloud balance sheet, or a sovereign/private AI factory, and the three should not be interpreted the same way. AI server sizing needs a more precise view of power capacity because megawatts determine what can physically deploy, while ownership determines where the hardware is counted.

The marginal dollar is moving outward.
Hyperscalers remain the center of gravity and we believe they will stay there, but the marginal dollar of growth is migrating outward. Neoclouds have gone from a rounding error to a financed capacity layer large enough that ignoring it, or attributing its hardware to the hyperscalers who rent it, throws the whole model off. Enterprise demand is the hardest segment to see cleanly, but it is also the segment where the economics are changing in a way we think the market is under-modeling.

Why we size enterprise/private AI factories as a new layer of server infrastructure.
A class of enterprises will not want every token to be a metered cloud token. As AI moves from experimentation to persistent internal workflows, token costs become an operating variable that companies will manage. Some workloads will stay in public cloud because the flexibility is worth paying for. Others will move closer to the enterprise because sustained utilization, improving local compute, smaller and more efficient models, and maturing on-prem software stacks make owned capacity more attractive.

The goal is not to bring every AI workload inside the company. The goal is to generate more local tokens where the workload is steady, sensitive, and valuable enough to justify the infrastructure.
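The role of sustained utilization can be made concrete with a simple break-even calculation. This is a hedged sketch that compares only cost: the function names, the $2M annual owned cost, the capacity figure, and the $5 per million tokens metered price are hypothetical, and the governance, latency, and sovereignty factors that also drive the decision are left out.

```python
def breakeven_utilization(annual_owned_cost_usd, capacity_mtok_per_year, metered_usd_per_mtok):
    """Utilization at which owned capacity costs the same as metered cloud tokens."""
    metered_cost_at_full_load = capacity_mtok_per_year * metered_usd_per_mtok
    return annual_owned_cost_usd / metered_cost_at_full_load


def cheaper_option(utilization, annual_owned_cost_usd, capacity_mtok_per_year, metered_usd_per_mtok):
    """Compare the fixed owned cost with the metered bill for the tokens actually used."""
    tokens_used_mtok = utilization * capacity_mtok_per_year
    metered_cost = tokens_used_mtok * metered_usd_per_mtok
    return "owned" if annual_owned_cost_usd < metered_cost else "metered"


# Hypothetical inputs, for illustration only.
ANNUAL_OWNED_COST_USD = 2_000_000   # amortized hardware plus power and operations
CAPACITY_MTOK_PER_YEAR = 1_000_000  # millions of tokens the system can serve per year
METERED_USD_PER_MTOK = 5.0          # cloud price per million tokens

threshold = breakeven_utilization(
    ANNUAL_OWNED_COST_USD, CAPACITY_MTOK_PER_YEAR, METERED_USD_PER_MTOK
)
print(f"break-even utilization: {threshold:.0%}")  # break-even utilization: 40%

steady = cheaper_option(0.7, ANNUAL_OWNED_COST_USD, CAPACITY_MTOK_PER_YEAR, METERED_USD_PER_MTOK)
bursty = cheaper_option(0.2, ANNUAL_OWNED_COST_USD, CAPACITY_MTOK_PER_YEAR, METERED_USD_PER_MTOK)
```

With these inputs a steady workload running at 70% utilization favors owned capacity, while a bursty one at 20% stays cheaper on metered tokens, which matches the split between workloads that move closer to the enterprise and those that stay in public cloud.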

Where this connects to permission.
Enterprises own AI infrastructure where governance, sovereignty, latency, sustained utilization, and token-cost control make ownership the approvable option. The workloads with the best returns are often the same workloads a compliance team is least willing to run anywhere else. Capacity can be financed and powered, and still sit unused if no one inside the customer is allowed to put sensitive data on it.

Related: “Confidential AI: Turning Trust Into AI Infrastructure Revenue,” Ben Bajarin, Jun 11.

Sizing the capacity is only one key question. Whether the regulated and sovereign share of it gets permission to run is the other. They are the same question asked from opposite ends of the spectrum.

The scale is large and growing fast: a market in the hundreds of billions of dollars a year, on a path toward the trillion-dollar range by the end of the decade. The level matters less than the structure, because the structure is what tells us which businesses, and which infrastructure players, capture the growth. This in turn tells us where the revenue is durable and where the same hardware is being counted twice. That structure, and the model behind it, is what the full report is for.

Inside the full report

The subscriber edition turns this framing into a working market model. It includes:

  • The full 2025–2030 forecast for all three ownership segments, with explicit low, base, and high scenario ranges rather than a single point number.

  • The capex-to-server-hardware bridge that reconciles disclosed hyperscaler capex to deployed, owner-based server TAM, with every adjustment shown so you can argue the assumptions.

  • Three segment deep dives covering hyperscaler, neocloud, and enterprise/private, each built from the company backlogs, capex guidance, contracted power, and channel data behind it.

  • The ownership-versus-consumption matrix, the discipline that keeps model-lab pipelines and leased capacity from being counted twice.

  • Architecture and cost-stack economics, including cost per gigawatt across NVIDIA and custom-ASIC systems and why the architecture mix is a first-order driver of the dollar TAM.

  • Forecast confidence grades by segment, naming the single variable most likely to move each one.

© 2026 Creative Strategies, Inc.