<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[The Diligence Stack - By Creative Strategies]]></title><description><![CDATA[The Diligence Stack delivers analyst-grade intelligence on AI’s full-stack impact, connecting semiconductors, infrastructure, platforms, software, and adoption to show how technical change reshapes markets and business models.]]></description><link>https://www.thediligencestack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!at7f!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eb90428-a00e-4b29-a979-0d47d3bf0802_612x612.png</url><title>The Diligence Stack - By Creative Strategies</title><link>https://www.thediligencestack.com</link></image><generator>Substack</generator><lastBuildDate>Sun, 28 Jun 2026 13:35:51 GMT</lastBuildDate><atom:link href="https://www.thediligencestack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Creative Strategies, Inc.]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[creativestrategies@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[creativestrategies@substack.com]]></itunes:email><itunes:name><![CDATA[Creative Strategies]]></itunes:name></itunes:owner><itunes:author><![CDATA[Creative Strategies]]></itunes:author><googleplay:owner><![CDATA[creativestrategies@substack.com]]></googleplay:owner><googleplay:email><![CDATA[creativestrategies@substack.com]]></googleplay:email><googleplay:author><![CDATA[Creative Strategies]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Qualcomm's Second Platform 
Moment]]></title><description><![CDATA[Data center changes the category, and custom Arm CPU may be the early upside]]></description><link>https://www.thediligencestack.com/p/qualcomms-second-platform-moment</link><guid isPermaLink="false">https://www.thediligencestack.com/p/qualcomms-second-platform-moment</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Thu, 25 Jun 2026 15:07:32 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Ihsk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a66b2a3-b430-4107-82ec-4790cd98d362_2640x1452.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Although The Diligence Stack is still new as a public research product, Qualcomm is not new coverage for us. Creative Strategies has followed the company for roughly 26 years, and we have long believed the data center opportunity represented the next logical extension of Qualcomm's engineering capabilities. Two pieces of history are worth remembering before getting into this report. First, Qualcomm's 2017 Centriq Arm server CPU was widely regarded by many industry contacts we spoke with as technically strong, arriving before the Arm server ecosystem was commercially ready. Today, that environment has been fundamentally validated by hyperscalers. Second, Broadcom's 2017-2018 hostile takeover attempt reinforced our view that Qualcomm's engineering talent, IP portfolio, and technical capabilities were more valuable than the market was giving them credit for. </em></p><p>We attended Qualcomm's Investor Day, spent time with management, and participated in the Q&amp;A with Akash Palkhiwala, Cristiano Amon, and Tony Pialis. We came away believing the data center opportunity is now considerably more tangible than many previously appreciated. 
Positioning-wise, we still view Qualcomm first as a semiconductor engineering company, but data center has become the next platform leg of that engineering story.</p><p>Qualcomm used Investor Day to put a different revenue curve in front of investors. The company had already been moving beyond handsets through automotive, IoT, PC, XR, and edge AI, but that version of the story, on the surface, still looked like offset work. Automotive and IoT could help absorb Apple modem loss, Samsung mix pressure, memory-driven Android weakness, and periodic QTL renewal concerns. That improved the quality of the business, but it did not force investors to place Qualcomm in a different category. <br><br><em>Chart for visual effect. We detail our entire model and assumptions for each revenue case in the full report and our estimate scenarios carry out to 2030. </em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ihsk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a66b2a3-b430-4107-82ec-4790cd98d362_2640x1452.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ihsk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a66b2a3-b430-4107-82ec-4790cd98d362_2640x1452.png 424w, https://substackcdn.com/image/fetch/$s_!Ihsk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a66b2a3-b430-4107-82ec-4790cd98d362_2640x1452.png 848w, https://substackcdn.com/image/fetch/$s_!Ihsk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a66b2a3-b430-4107-82ec-4790cd98d362_2640x1452.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Ihsk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a66b2a3-b430-4107-82ec-4790cd98d362_2640x1452.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ihsk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a66b2a3-b430-4107-82ec-4790cd98d362_2640x1452.png" width="1456" height="801" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3a66b2a3-b430-4107-82ec-4790cd98d362_2640x1452.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:801,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:265451,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thediligencestack.com/i/203540753?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a66b2a3-b430-4107-82ec-4790cd98d362_2640x1452.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ihsk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a66b2a3-b430-4107-82ec-4790cd98d362_2640x1452.png 424w, https://substackcdn.com/image/fetch/$s_!Ihsk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a66b2a3-b430-4107-82ec-4790cd98d362_2640x1452.png 848w, 
https://substackcdn.com/image/fetch/$s_!Ihsk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a66b2a3-b430-4107-82ec-4790cd98d362_2640x1452.png 1272w, https://substackcdn.com/image/fetch/$s_!Ihsk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3a66b2a3-b430-4107-82ec-4790cd98d362_2640x1452.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>As the chart above visualizes, the data center framework changes Qualcomm&#8217;s growth trajectory. 
Management raised its FY29 non-handset QCT target to $40B, roughly twice the prior FY29 target, and put a more than $15B FY29 data center target inside that number. CFO Akash Palkhiwala also made the mix change direct: by FY29, handsets are expected to fall toward roughly one-third of QCT revenue. That is the cleanest stat from the day on how the diversification story has evolved with data center now in the picture. Qualcomm can still be one of the most important mobile silicon companies in the world, while the investment argument increasingly depends on whether it becomes an edge-to-data-center AI compute platform.</p><p>We came away from the event and follow-up Q&amp;A believing the data center narrative is more concrete than many investors assumed going in. The FY27 anchor is custom silicon-led, with two global hyperscale customers each contributing at least $1B according to management&#8217;s Q&amp;A comments, and each with multi-generational programs planned. The product cadence then layers in HBC-based AI acceleration and C1000 server CPUs. This sequence reduces the burden on any one product line. Custom silicon creates the first revenue line, CPU creates the more lasting hold on the socket, HBC gives Qualcomm an inference architecture of its own, and Alphawave adds the I/O, die-to-die, SerDes, optical, and chiplet assets that make the platform story more credible.</p><p>The custom Arm CPU point deserves more attention than it has received. Agentic AI increases host-side work inside the data center. Tool calls, retrieval, API routing, state management, security, scheduling, and accelerator coordination all run through the CPU complex. If hyperscalers want a standard Arm server CPU path, Arm CSS is available. If they want a more specialized Arm CPU with custom cores, chiplets, high-speed I/O, memory attach, and implementation help, the external partner list narrows quickly. 
Qualcomm&#8217;s Oryon work, architecture-license position, mobile-to-auto CPU experience, and Alphawave connectivity assets give it a credible claim to that role.</p><p>HBC is the second technical piece to understand. High Bandwidth Compute is Qualcomm&#8217;s answer to the inference memory bottleneck. Traditional accelerator systems spend power and packaging budget moving data between compute and external memory stacks. Qualcomm&#8217;s approach places the XPU under DRAM stacks so the compute sits closer to memory. The claim is SRAM-like performance with DRAM-class density, with better bandwidth per watt and capacity per watt across different inference workloads. The business read-through is cost per token, rather than a generic accelerator benchmark. We view this as interesting but in need of further proof. Much of the industry has been circling near-memory compute, mostly in R&amp;D, so this could validate that approach, which would also make LPDDR a strategic part of compute packaging.</p><p>Our scenario model frames the change. The bear case takes Qualcomm to roughly $61.5B of FY29 revenue with $10B from data center. The base case, which largely follows management&#8217;s Investor Day framework, reaches roughly $73.6B of FY29 revenue with $15B from data center. The bull case reaches roughly $89B of FY29 revenue with $22B from data center, driven mainly by custom Arm CPU absorption and HBC/connectivity attach above the initial guide. In the base case, Qualcomm grows well beyond the pre-event low/mid-$40B revenue framing. In the bull case, the business is roughly twice its FY25 revenue base by FY29.</p><p>That is the reason we frame this as Qualcomm&#8217;s second platform moment. The first platform was mobile. The second is the attempt to extend Qualcomm&#8217;s compute, connectivity, and low-power design DNA into data center AI infrastructure while keeping the edge portfolio compounding. Data center is the growth driver. 
Automotive, industrial, robotics, personal AI, PC, XR, and QTL make the bridge less fragile. The diligence question is now whether $15B is a ceiling, or the first visible layer of a larger custom silicon platform business.</p><h1>Inside the full subscriber report</h1><ul><li><p>The pre- and post-Investor Day revenue bridge: why the old model looked like offset work and the new model changes the category.</p></li><li><p>A full scenario model through FY31, including bear/base/bull revenue paths and the EV/sales read-through at each path.</p></li><li><p>A data center stack that separates custom Arm CPU, custom ASIC services, HBC acceleration, and Alphawave connectivity/IP.</p></li><li><p>Why custom Arm CPU may be the more lasting upside layer if hyperscalers move beyond standard Arm CSS building blocks.</p></li><li><p>An explanation of HBC and why its economic value is tied to memory movement, bandwidth per watt, and cost per token.</p></li><li><p>A full FY29 business breakdown showing how auto, industrial, robotics, personal AI, PC, XR, and QTL contribute around the data center ramp.</p></li><li><p>What would change our view. The operating variables that would make us more constructive or force us to reduce the data center multiple credit.</p><p></p></li></ul>
      <p>
          <a href="https://www.thediligencestack.com/p/qualcomms-second-platform-moment">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[GPU Tsunami Beneficiaries: Power Semis and Analog ]]></title><description><![CDATA[How AI Racks Pull Power Semiconductors and Analog Control Into a New Demand Cycle]]></description><link>https://www.thediligencestack.com/p/gpu-tsunami-beneficiaries-power-semis</link><guid isPermaLink="false">https://www.thediligencestack.com/p/gpu-tsunami-beneficiaries-power-semis</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Tue, 23 Jun 2026 15:28:07 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ilTa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629f2b30-8b30-4446-a350-859ece5ec790_1192x772.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We call what has happened in the semiconductor industry the GPU Tsunami. GPU + AI was the initial shockwave, and the chart below visualizes the aftershock. Historical semiconductor cycles, usually tied to PCs, handsets, memory, or industrial and automotive demand, kept the industry at steady but slow growth. The GPU + AI moment shocked the industry, and the entire semiconductor industry is now growing at an unprecedented rate. The GPU is still the fuel, the center of gravity, but the demand it creates travels through hundreds of layers of the semiconductor supply chain: memory, substrates, packaging, networking, timing, power delivery, analog control, passives, thermal systems, test, WFE, and the mature-node capacity that supports much of the physical infrastructure around the accelerator. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ilTa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629f2b30-8b30-4446-a350-859ece5ec790_1192x772.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ilTa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629f2b30-8b30-4446-a350-859ece5ec790_1192x772.png 424w, https://substackcdn.com/image/fetch/$s_!ilTa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629f2b30-8b30-4446-a350-859ece5ec790_1192x772.png 848w, https://substackcdn.com/image/fetch/$s_!ilTa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629f2b30-8b30-4446-a350-859ece5ec790_1192x772.png 1272w, https://substackcdn.com/image/fetch/$s_!ilTa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629f2b30-8b30-4446-a350-859ece5ec790_1192x772.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ilTa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629f2b30-8b30-4446-a350-859ece5ec790_1192x772.png" width="1192" height="772" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/629f2b30-8b30-4446-a350-859ece5ec790_1192x772.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:772,&quot;width&quot;:1192,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:194609,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thediligencestack.com/i/203105848?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629f2b30-8b30-4446-a350-859ece5ec790_1192x772.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ilTa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629f2b30-8b30-4446-a350-859ece5ec790_1192x772.png 424w, https://substackcdn.com/image/fetch/$s_!ilTa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629f2b30-8b30-4446-a350-859ece5ec790_1192x772.png 848w, https://substackcdn.com/image/fetch/$s_!ilTa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629f2b30-8b30-4446-a350-859ece5ec790_1192x772.png 1272w, https://substackcdn.com/image/fetch/$s_!ilTa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F629f2b30-8b30-4446-a350-859ece5ec790_1192x772.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>That is where the cycle starts to get more interesting for all industry stakeholders. The obvious AI beneficiaries have already been heavily debated and largely mapped by us and others. The more interesting question to us is which parts of the semiconductor supply chain are being pulled into a new growth curve even though they do not screen like AI businesses at first glance. Power semiconductors and analog control sit near the top of that list.</p><p>A GPU creates the load, but it cannot use electricity in the form it arrives from the grid. Power has to be converted, stepped down, regulated close to the die, sensed, protected, and monitored continuously. 
As AI racks move from roughly 100kW-class systems toward several hundred kilowatts and eventually toward megawatt-class designs, that power tree becomes more complex and more semiconductor-rich. Content per rack rises because the system needs more point-of-load regulation, more intermediate bus conversion, more protection, more telemetry, and more analog control to make the accelerator usable at density.</p><p>That point becomes easier to see when you look directly at the power delivery hardware around a next-generation GPU tray. The photo below shows dense capacitor banks sitting immediately adjacent to a Vera Rubin GPU tray. Those cans are not the regulators themselves, and the voltage markings should not be added up as a direct power calculation. A 16V or 63V marking is a component rating, not the rail&#8217;s wattage. But the density is still insightful for our thesis. Many dozens of 100&#181;F-class polymer capacitors sit beside each GPU power zone, likely supporting a mix of intermediate and local rails, absorbing fast current swings, reducing ripple, and giving the voltage regulators enough local energy storage to keep the GPU stable during abrupt workload transitions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8Gps!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09162dc5-b3da-4b16-a650-1d6e01f25cdd_768x1024.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8Gps!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09162dc5-b3da-4b16-a650-1d6e01f25cdd_768x1024.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!8Gps!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09162dc5-b3da-4b16-a650-1d6e01f25cdd_768x1024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8Gps!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09162dc5-b3da-4b16-a650-1d6e01f25cdd_768x1024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8Gps!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09162dc5-b3da-4b16-a650-1d6e01f25cdd_768x1024.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8Gps!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09162dc5-b3da-4b16-a650-1d6e01f25cdd_768x1024.jpeg" width="768" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/09162dc5-b3da-4b16-a650-1d6e01f25cdd_768x1024.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:768,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:350098,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.thediligencestack.com/i/203105848?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09162dc5-b3da-4b16-a650-1d6e01f25cdd_768x1024.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!8Gps!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09162dc5-b3da-4b16-a650-1d6e01f25cdd_768x1024.jpeg 424w, https://substackcdn.com/image/fetch/$s_!8Gps!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09162dc5-b3da-4b16-a650-1d6e01f25cdd_768x1024.jpeg 848w, https://substackcdn.com/image/fetch/$s_!8Gps!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09162dc5-b3da-4b16-a650-1d6e01f25cdd_768x1024.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!8Gps!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09162dc5-b3da-4b16-a650-1d6e01f25cdd_768x1024.jpeg 1456w" sizes="100vw"></picture></div></a></figure></div><p><em>This is a Vera Rubin + Vera CPU compute tray in an HPE Cray design. OEM/ODM builds will vary.</em></p><p>That is the physical version of the thesis. AI power content is not just moving into the PSU, the sidecar, or the facility layer. It is moving onto the tray, potentially the substrate, and closer to the accelerator. Each generation requires more regulated, sensed, buffered, protected, and telemetry-managed power within inches of the die. For analog and power semiconductor suppliers, rising rack power translates into a larger and more complex control problem. Every additional watt moving through the rack has to pass through layers of conversion, regulation, sensing, protection, and telemetry before it can be turned into usable compute.</p><p>Analog and power are different from the parts of the semiconductor industry investors tend to associate with fast AI scaling. Logic can often ride leading-edge process roadmaps, large platform concentration, and aggressive foundry capacity plans. Analog and power scale through a more physical supply chain: mature-node capacity, high-voltage process know-how, 200mm and 300mm power-device fabs, SiC and GaN material availability, thermal packaging, passives, magnetics, test, and long customer qualification cycles. The dirty secret is that analog often scales less cleanly than logic, and it is also much harder from a design and engineering standpoint. It stays closer to the physics, where noise, heat, voltage behavior, layout, and process variation can determine whether a part works at spec. That makes the supply response slower when demand suddenly accelerates. 
Huge opportunity for agentic EDA here:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;dcad55c6-4c85-4d9b-b531-eb2c40d49b81&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Agentic EDA and the Next Revenue Layer in Chip Design&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-06-09T14:59:17.707Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/22ac5bef-b5fc-4926-9d94-c97f7a2bc34e_2816x1536.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/agentic-eda-and-the-next-revenue&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:200779710,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:4,&quot;comment_count&quot;:2,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!at7f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eb90428-a00e-4b29-a979-0d47d3bf0802_612x612.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>That makes power and analog a fascinating layer in the giga cycle. 
ASPs can rise when supply tightens, but the more important point is that these components become part of the deployment math and a highly specific engineering challenge. A rack can have the right accelerator, memory, and networking, and still be limited by whether the power delivery architecture can support the density. That is the part of the GPU tsunami we focus on in this report: how AI rack density pulls power semiconductors and analog control into a demand cycle that is still under-discussed relative to how much it affects the rest of the build.</p><p>While this report is focused on power and analog attach to compute racks, we have a full report on the 800 VDC beneficiary ecosystem.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;5a6c9a1b-3e0b-4b61-9284-c79da7c2e065&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;800 VDC: The Inflection Point Reshaping Datacenter Power and AI Infrastructure&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-03-31T15:23:02.714Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!DnlG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd275433e-7f06-42f1-bc30-b4bd9a3046e2_1076x624.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/800-vdc-the-inflection-point-reshaping&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:192237874,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:14,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!at7f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eb90428-a00e-4b29-a979-0d47d3bf0802_612x612.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2>What&#8217;s in the full report:</h2><p>For paying subscribers, the full report goes deeper into the model, the rack architecture, and the supplier map. 
Inside we cover:</p><ul><li><p>Our direct AI datacenter power semiconductor and analog control TAM model from 2025 to 2030.</p></li><li><p>Why stage 2 point-of-load power and intermediate bus conversion capture the largest share of direct semiconductor value.</p></li><li><p>How rack density changes the BOM as systems move from 100kW-class racks toward 600kW and 1MW-class architectures.</p></li><li><p>Why 48V and 54V distribution start to run out of room as current, copper, heat, and rack volume become limiting factors.</p></li><li><p>How 800V DC should be understood as a capacity architecture rather than only an efficiency upgrade.</p></li><li><p>The &#8220;power tree&#8221; from grid to core, including AC/DC, sidecars, IBC, point-of-load, protection, telemetry, and buffering.</p></li><li><p>The analog control layer: hot swap, eFuse, current sensing, isolation, digital power control, PMICs, and telemetry.</p></li><li><p>The role of BBU, CBU, and power smoothing as AI workloads become more dynamic.</p></li><li><p>A socket-level beneficiary map separating silicon / analog control, power modules / sidecar components, and datacenter power systems.</p></li><li><p>The risks and watch items that could change the slope of the thesis, including architecture timing, multi-sourcing, qualification cycles, and whether power vendors begin disclosing AI-specific revenue separately.</p></li></ul><p></p>
      <p>
          <a href="https://www.thediligencestack.com/p/gpu-tsunami-beneficiaries-power-semis">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Free Chart Friday: The Enterprise AI Operating Stack]]></title><description><![CDATA[Three reports, one scorecard, and who holds the pieces production AI needs]]></description><link>https://www.thediligencestack.com/p/free-chart-friday-the-enterprise</link><guid isPermaLink="false">https://www.thediligencestack.com/p/free-chart-friday-the-enterprise</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Fri, 19 Jun 2026 17:00:53 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!g_Tz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4038789-aeab-4fd0-b16f-ddc0f5f52a3a_1246x1032.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Enterprise AI is crossing from pilots into production, and that transition is what our last three reports have mapped from different angles. Read in order they answer three questions that decide whether a workload ships and who gets paid when it does: whether a sensitive workload can run inside an approved trust boundary at all, who owns the capacity it runs on, and which vendors capture the value as enterprises generate more tokens close to their own data. The scorecard below is where the three lines meet.</p><p>The first report started where the capacity models stop. Sizing AI as a supply problem, GPUs, power, packaging, capex, captures how much gets built, not whether the highest-value workloads are allowed to run. The data with the best returns is usually the data legal, compliance, and security teams will not expose to a multi-tenant cloud, so capacity can be financed and energized and still sit unused. We called that second variable permission. 
The argument was that confidential computing sits on that boundary as a conversion and pricing layer rather than a security line item: hardware-rooted trusted execution plus remote attestation turns trust into an artifact a compliance team can file and an auditor can check, and once that artifact exists, approval behavior changes and blocked workloads become consumed infrastructure. The lift runs two ways in our framework, a trust premium on workloads already heading to cloud and the larger conversion of regulated demand that could not run at any price before, <strong>which is why, for this equation particularly, the useful unit moves from tokens per watt to protected tokens per watt and a dollar cleared by compliance should behave differently in a price war than a dollar of experimentation.</strong></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;5b2a00d6-6530-41f5-b403-21c98d9abc6f&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Confidential AI: Turning Trust Into AI Infrastructure Revenue&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-06-11T16:35:37.555Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0467222c-ffd3-4708-8fe4-099d9bffbba8_2752x1536.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/confidential-ai-turning-trust-into&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:201519863,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:9,&quot;comment_count&quot;:2,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!at7f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eb90428-a00e-4b29-a979-0d47d3bf0802_612x612.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>The second report took the same permission lens to the supply side and found that AI server demand is no longer one market. Once you model by who owns the hardware rather than where it sits, the buildout separates into three: hyperscaler-owned capacity, the neocloud and third-party AI factory layer, and the enterprise or private AI factory that companies, governments, and regulated industries run inside their own walls. A single GPU cluster financed by a neocloud, contracted by a hyperscaler, consumed by a model lab, and booked by an ODM lands in four reporting streams, and adding them together produces a market that does not physically exist. 
Our base case carries total demand from roughly $228 billion in 2025 toward $845 billion in 2030, with hyperscalers still the anchor near two-thirds of the market and the marginal dollar of growth migrating outward. The enterprise and private segment is the least observable of the three and the one we hold with the widest range, precisely because its growth is governed by the permission economics the first report priced. That is where the two notes become the same question asked from opposite ends.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;a42d1b49-691b-4c6b-89cf-c4c5d291db01&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;AI Server Demand Is Becoming Three Markets&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-06-16T15:37:59.210Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!4dPF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F036a3d2f-1ed7-4e5a-8296-47a53fa73d58_994x362.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/ai-server-demand-is-becoming-three&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:202194278,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:5,&quot;comment_count&quot;:3,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative 
Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!at7f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eb90428-a00e-4b29-a979-0d47d3bf0802_612x612.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>The third report went inside that enterprise market and asked who captures the value as token generation moves closer to enterprise data. The workloads that qualify are the persistent ones, agents that run continuously and hold state, retrieval against proprietary corpora, fraud and compliance pipelines, the cases where steady utilization, data sensitivity, latency, and governance line up at once. For those deployments the private AI factory should not be read as a server sale. The server is the entry ticket, and the durable economics sit in what attaches behind it: storage, networking, power and cooling, security, confidential compute, management, software, financing, and services. That reframing turns the investment question into a revenue-quality question, because the same AI server dollar can sit on thin GPU pass-through or pull a higher-quality attach stack behind it, and presence somewhere in that stack is not the same as a strong position in it. Dell and HPE are the cleanest public test cases, and they attack the opportunity from opposite ends, Dell through AI-factory scale and storage pull-through and HPE through a networking-led private-cloud and operations layer. 
That distinction is the one the scorecard is built to make legible.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;53f9f33b-720a-4acb-b9ea-7283d6035aab&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The Local Token Stack&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-06-18T17:29:59.641Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!yFu2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F928a4692-a87a-434c-a261-f70d042cd097_1568x801.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/the-local-token-stack&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:202584207,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:3,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!at7f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eb90428-a00e-4b29-a979-0d47d3bf0802_612x612.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>Put those three lines together and the question for an enterprise moving into production changes shape. 
It is not which single venue wins. Production AI runs hybrid and multicloud, across owned capacity, neocloud, and the hyperscalers at once, so the operative question is which vendors hold the operating-stack pieces (the governance, security, data, orchestration, and confidential and sovereign capabilities the series identified) that let an enterprise run agentic AI in production wherever it sits. The scorecard grades eighteen of those capability domains for breadth and ownership on our framework, not for revenue or valuation. Read that way, it makes one structural point first: accelerated compute is the only domain where every vendor scores a full three, so the metal is a cleared baseline and the entire spread opens up above it, in exactly the operating and data layers the three reports said decide whether a workload reaches production. <em>Note: this is purely a capability scorecard; it does not score the quality of each capability.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g_Tz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4038789-aeab-4fd0-b16f-ddc0f5f52a3a_1246x1032.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g_Tz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4038789-aeab-4fd0-b16f-ddc0f5f52a3a_1246x1032.png 424w, https://substackcdn.com/image/fetch/$s_!g_Tz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4038789-aeab-4fd0-b16f-ddc0f5f52a3a_1246x1032.png 848w, 
https://substackcdn.com/image/fetch/$s_!g_Tz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4038789-aeab-4fd0-b16f-ddc0f5f52a3a_1246x1032.png 1272w, https://substackcdn.com/image/fetch/$s_!g_Tz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4038789-aeab-4fd0-b16f-ddc0f5f52a3a_1246x1032.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g_Tz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4038789-aeab-4fd0-b16f-ddc0f5f52a3a_1246x1032.png" width="1246" height="1032" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a4038789-aeab-4fd0-b16f-ddc0f5f52a3a_1246x1032.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1032,&quot;width&quot;:1246,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:293639,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.thediligencestack.com/i/202741440?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4038789-aeab-4fd0-b16f-ddc0f5f52a3a_1246x1032.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!g_Tz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4038789-aeab-4fd0-b16f-ddc0f5f52a3a_1246x1032.png 424w, 
https://substackcdn.com/image/fetch/$s_!g_Tz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4038789-aeab-4fd0-b16f-ddc0f5f52a3a_1246x1032.png 848w, https://substackcdn.com/image/fetch/$s_!g_Tz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4038789-aeab-4fd0-b16f-ddc0f5f52a3a_1246x1032.png 1272w, https://substackcdn.com/image/fetch/$s_!g_Tz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa4038789-aeab-4fd0-b16f-ddc0f5f52a3a_1246x1032.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Breadth favors the hyperscalers, with Azure separating at 53 of a possible 54 and its lone soft spot in autonomous network operations, the one domain where HPE scores a three. That same plane explains HPE&#8217;s 47 against Dell&#8217;s 40, a gap that sits almost entirely in orchestration, observability, networking, security, and FinOps while Dell&#8217;s owned strength runs through storage and data-protection resilience, which is the deployable-versus-operable split from the third report rendered as two different shapes rather than one ranking. The AI-native clouds play in a more specific lane for now, with CoreWeave and Nebius strong on compute, orchestration, and scheduling and thin on the data-platform and governance domains that regulated production demands, coverage profiles of 35 and 36 that read as infrastructure depth without enterprise breadth. IREN sits earliest at 22, its strength concentrated in the physical layer with the operating-stack build still ahead of it.</p><p>The point we want to land is what stack breadth represents. A vendor with owned or tightly integrated capability across compute, storage, data, networking, security, observability, and cost governance has more ways to capture regulated, sovereign, and enterprise workloads &#8212; and more ways to keep the attach revenue around those workloads. That is the line this series has followed from the start: revenue quality improves when the vendor captures more of the operating stack, and deteriorates when AI infrastructure remains a hardware pass-through cycle.</p><p>The scorecard is the compressed version of that argument. As production AI moves into a hybrid and multicloud world, the winning stacks will be the ones that make token generation governable, secure, metered, operable, and recoverable. The hardware still matters. 
The question is who owns enough of the environment around it to make the revenue durable.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.thediligencestack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">The Diligence Stack delivers analyst-grade intelligence on the companies reshaping the technology landscape. The Diligence Stack is built around a systems-level view of AI. Because AI touches every aspect of a business, understanding it requires an interdisciplinary lens. We connect semiconductors, datacenter infrastructure, cloud platforms, frontier models, enterprise software, and customer adoption to understand how AI is reshaping technology markets, business models, and competitive advantage. </p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[The Local Token Stack]]></title><description><![CDATA[Who Benefits from Private AI Factories]]></description><link>https://www.thediligencestack.com/p/the-local-token-stack</link><guid isPermaLink="false">https://www.thediligencestack.com/p/the-local-token-stack</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Thu, 18 Jun 2026 17:29:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!yFu2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F928a4692-a87a-434c-a261-f70d042cd097_1568x801.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em><span 
data-color="rgb(102, 102, 102)" style="color: rgb(102, 102, 102);">This is the third note in our AI infrastructure series, and it builds on the first two. Report 1, &#8220;Confidential AI,&#8221; framed confidential computing as the permission and pricing layer for AI infrastructure &#8212; what decides which sensitive, regulated, and sovereign tokens can run inside an approved trust boundary, and why the relevant metric shifts from tokens per watt to protected tokens per watt. Report 2, &#8220;AI Server Demand Is Becoming Three Markets,&#8221; sized the buildout by ownership rather than location, separating hyperscaler-owned capacity, the neocloud and third-party AI factory layer, and the enterprise or private AI factory that companies, governments, and regulated industries buy and run inside their own walls. This report goes inside that third market. It sharpens which workloads actually justify local token generation and maps the beneficiaries &#8212; the companies that capture value as enterprises generate more tokens on infrastructure they own or control, close to their own data.</span></em></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;625d9ac0-5317-49a4-b215-ef66abe88698&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Confidential AI: Turning Trust Into AI Infrastructure Revenue&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-06-11T16:35:37.555Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0467222c-ffd3-4708-8fe4-099d9bffbba8_2752x1536.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/confidential-ai-turning-trust-into&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:201519863,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:9,&quot;comment_count&quot;:2,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!at7f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eb90428-a00e-4b29-a979-0d47d3bf0802_612x612.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;93df8f42-d192-476c-851c-1cd9deced6d5&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;AI Server Demand Is Becoming Three Markets&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-06-16T15:37:59.210Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!4dPF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F036a3d2f-1ed7-4e5a-8296-47a53fa73d58_994x362.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/ai-server-demand-is-becoming-three&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:202194278,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:5,&quot;comment_count&quot;:3,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!at7f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eb90428-a00e-4b29-a979-0d47d3bf0802_612x612.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>As we have been laying out, most enterprise AI analysis still starts from the assumption that the enterprise is a pure cloud customer. That remains true for many workloads today, including AI, but it no longer captures the full direction of how we see this playing out. We are increasingly confident that certain AI workloads will work their way back toward infrastructure the enterprise owns, controls, or has dedicated access to. 
Our conversations from Dell Technologies World and HPE Discover reinforced that view.</p><p>We are not talking about every AI workload, and we are not arguing for a broad reversal back to the old on-prem server cycle. The workloads that start to qualify are the ones that become persistent internal processes: agents running in support, code assistants embedded into developer workflows, fraud systems scoring transactions continuously, or knowledge systems sitting close to proprietary enterprise data. Once those workflows run every day, token generation starts to behave less like experimental software consumption and more like an operating input.</p><p>That changes the buying conversation. Finance begins to care about recurring token cost. Security begins to care about where data moves and who can touch it. Infrastructure teams begin to care about utilization, latency, governance, and <strong>whether the workload is predictable enough to justify dedicated capacity</strong>. Public cloud still remains the better answer for burst, experimentation, frontier model access, and workloads where elasticity matters more than control. But the more repeatable the workload becomes, the more the enterprise begins to ask a familiar infrastructure question: if we use this capacity constantly, should we keep renting it by the unit or control more of the stack ourselves?</p><p>That is the on-prem qualification. The private AI factory is not a universal destination for enterprise AI. It is the infrastructure response to workloads where utilization, data sensitivity, governance, and workflow value line up. Where those variables line up, local token generation becomes easier to justify. 
Where they do not, public cloud remains the default.</p><h2><strong>The Workload Has To Earn Its Way On-Prem</strong></h2><p>We again need to emphasize that this shift is early. We are outlining the problems we hear about and the challenges enterprise customers face as agentic AI gets deployed in their organizations. This is not a sweeping &#8220;AI moves back on-prem&#8221; call, and it is not just a new label for the old enterprise server cycle. Public cloud has real advantages: elasticity, model availability, global reach, lower operating burden, and the ability to experiment without building dedicated infrastructure. For many enterprises, most AI will stay there.</p><p>However, as enterprises have begun to deploy agentic AI, it is clear that certain classes of workloads are leading them to ask new questions and think about longer-term strategies. When a workload runs continuously, touches sensitive data, and creates enough internal value, the economics start to change. A support agent running all day, a developer assistant embedded into the engineering workflow, or a fraud system scoring transactions continuously has a different utilization profile than a pilot project. At steady utilization, the customer eventually asks the same question it has asked in every compute cycle: is this cheaper to rent by the unit, or to control directly because we use it constantly?</p><p>That is data-center utilization logic applied to tokens. The language is new because the unit is new, but the underlying infrastructure math is the same. The asset used occasionally is usually easier to rent. The asset used constantly eventually invites an ownership or dedicated-capacity discussion.</p><h2><strong>Why The Local Case Is Getting Better</strong></h2><p>Local inference is getting cheaper as accelerators improve and smaller models become good enough for defined production tasks. The software stack for running inference on owned hardware is also becoming more deployable. 
These are all important factors, because many enterprises do not want to engage in exercises that do not generate ROI. They need something their infrastructure teams can operate, govern, secure, and support.</p><p>Data gravity strengthens the local case. Many enterprise workflows already sit close to internal databases, file systems, identity layers, permission structures, and proprietary data. Moving that data to an external model can be expensive, slow, and operationally awkward. In some cases, the challenge is less cost than approval. For regulated, sovereign, or IP-sensitive workloads, the ability to prove where the data runs, who can touch it, and how the system is governed can determine whether the project moves into production at all.</p><p>That is why cost-per-token is only part of the discussion. Control, auditability, latency, data movement, and workflow integration all matter. None of those variables moves every workload into controlled infrastructure, but where they line up, the local-token case becomes much easier to assess.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;b672885c-ff1c-4b54-9b43-65bb47570e51&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The Agentic AI Storage Shock&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-05-21T15:27:52.131Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!dEqI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa66aeade-08be-401e-a4f4-e4b8da29998d_1800x1050.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/the-agentic-ai-storage-shock&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:198594825,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:20,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!at7f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eb90428-a00e-4b29-a979-0d47d3bf0802_612x612.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2><strong>Agents Are The Utilization Case</strong></h2><p>The workload category we are most focused on is persistent internal agents. These systems do not behave like occasional chatbot sessions. They run continuously, call tools, query databases, maintain state, and interact with enterprise systems across many steps. That makes them one of the cleaner utilization cases for owned or controlled AI capacity.</p><p>It also makes the governance problem harder. A chatbot can be controlled at the application boundary. 
An agent that retrieves, reasons, acts, and writes back across internal systems creates a different kind of control challenge. It touches more systems, creates more audit requirements, and raises the importance of identity, permissions, observability, and policy enforcement.</p><p>That does not mean every agent needs to run locally. We would be careful about making that leap. The more grounded point is that the highest-control agentic workflows are likely to be among the first places enterprises ask for dedicated AI infrastructure. When agentic AI becomes a real production architecture, rather than a proof of concept layer, it could pull token generation toward controlled capacity faster than any single regulated vertical does.</p><h2><strong>The Server Is Only The Anchor</strong></h2><p>The private AI factory should not be modeled as a server sale as it once was. The server anchors the deployment, but the investment question is the attach stack around it: storage, networking, data protection, power and cooling, security, operating software, financing, lifecycle services, and integration work. For every dollar of AI server hardware, the diligence question is how many additional dollars follow, what margin they carry, and how repeatable the deployment becomes.</p><p>This is where backlog size can mislead. A large AI server backlog may say more about supply-chain access than durable earnings power. Backlog quality depends on what attaches to the compute sale and whether that attach turns into a broader operating layer around local token generation. A vendor that sells the GPU box and stops there may get revenue, but the quality of that revenue is different from a vendor that captures storage, networking, services, software, financing, and lifecycle management behind the same deployment.</p><p>That distinction is important because private AI infrastructure could either become another lower-margin hardware cycle or a more durable enterprise platform cycle. 
The difference will come down to attach, utilization, repeat deployments, and margin behavior as the market scales.</p><h2><strong>Dell And HPE Are The Clearest Test Cases</strong></h2><p>Dell and HPE are the two clearest public test cases, but they are approaching the market from different parts of the stack. Dell is the scale, storage, financing, and full-rack local-token platform case. Its advantage starts with distribution, supply-chain scale, AI server volume, and a large installed storage footprint. The strategic read is that Dell is trying to turn AI server demand into a broader enterprise AI Factory motion, with compute pulling storage, data management, services, financing, and lifecycle attach behind it.</p><p>HPE is taking a different route. We read HPE as more of a networking-led private cloud integration case. Juniper, Aruba, GreenLake, and its sovereign enterprise focus give it a different wedge into the same customer problem. In HPE&#8217;s version of the thesis, the network and operating layer become part of the control plane for private AI. That matters more if agentic systems force enterprises to rethink identity, policy, traffic flow, observability, and governance around AI workloads.</p><p>Both companies are aligning around the same broader shift, but from different starting points. Dell starts with the AI factory and tries to attach the stack behind it. HPE starts with networking, private cloud operations, and governance, then tries to pull compute and storage alongside that control plane. 
The full report goes deeper on where each company is advantaged, where the execution risk sits, and how the broader vendor map forms around this local-token stack.</p><h2><strong>Inside the full report</strong></h2><blockquote><p><span data-color="rgb(242, 101, 34)" style="color: rgb(242, 101, 34);">&#9642;</span><span> </span>A workload-by-workload suitability matrix, with our directional estimate of how much of each workload&#8217;s token generation lands on owned capacity versus public cloud.</p><p><span data-color="rgb(242, 101, 34)" style="color: rgb(242, 101, 34);">&#9642;</span><span> </span>The full private AI factory attach stack: the layers a single server sale pulls, who benefits in each, and the evidence that would confirm the attach is real.</p><p><span data-color="rgb(242, 101, 34)" style="color: rgb(242, 101, 34);">&#9642;</span><span> </span>The Dell versus HPE scorecard across nine dimensions, and where Lenovo&#8217;s hybrid and edge model fits without diluting the comparison.</p><p><span data-color="rgb(242, 101, 34)" style="color: rgb(242, 101, 34);">&#9642;</span><span> </span>Why Cisco&#8217;s networking opportunity is a security and observability control-plane story rather than switching alone, and how HPE through Juniper attacks the same problem from the other direction.</p><p><span data-color="rgb(242, 101, 34)" style="color: rgb(242, 101, 34);">&#9642;</span><span> </span>The agentic storage shift: why persistent AI turns storage from a passive repository into part of the inference loop, and reprices the data layer.</p><p><span data-color="rgb(242, 101, 34)" style="color: rgb(242, 101, 34);">&#9642;</span><span> </span>How to separate confidential-compute silicon from the software that monetizes it, and which layer actually captures the recurring revenue.</p><p><span data-color="rgb(242, 101, 34)" style="color: rgb(242, 101, 34);">&#9642;</span><span> </span>A revenue-quality ladder for grading any beneficiary&#8217;s AI revenue, plus a 
diligence checklist of what to watch and the bear case that mirrors it.</p></blockquote>
      <p>
          <a href="https://www.thediligencestack.com/p/the-local-token-stack">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[AI Server Demand Is Becoming Three Markets]]></title><link>https://www.thediligencestack.com/p/ai-server-demand-is-becoming-three</link><guid isPermaLink="false">https://www.thediligencestack.com/p/ai-server-demand-is-becoming-three</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Tue, 16 Jun 2026 15:37:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!4dPF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F036a3d2f-1ed7-4e5a-8296-47a53fa73d58_994x362.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em><strong>Companion note. </strong>This report is the supply-and-capacity half of a two-part research program. It sizes how much AI server hardware gets built and who owns it. The companion note, &#8220;<a href="https://www.thediligencestack.com/p/confidential-ai-turning-trust-into">Confidential AI: The Permission Layer for Enterprise and Sovereign Infrastructure&#8221; (June 2026)</a>, takes the demand-conversion half, namely whether the regulated and sovereign share of that capacity is allowed into production and what that does to revenue quality. The two notes meet in the enterprise and sovereign segment, where capacity sizing and permission economics turn out to be the same question asked from opposite ends.</em></p><p>Most AI server numbers in circulation still try to measure one market. We have increasing conviction that this is no longer the right way to analyze what is happening. The buildout is separating into three markets, and the line between them is better explained by ownership than location.</p><p><strong>Ownership is the cleaner starting point.</strong><br>Cloud versus on-premises, the framing inherited from the last server cycle, has become the least useful question you can ask about an AI rack. 
What decides the economics now is who owns or finances the hardware, who consumes the compute, and where the capacity sits in the value chain. Those questions increasingly return three different answers for the same rack.</p><p><strong>This is where the double counting starts.</strong><br>Walk a single cluster through the value chain and you can watch it happen in real time. A neocloud raises debt against a long-term contract, buys GPU systems from a contract manufacturer, and installs them in a leased, powered building. A hyperscaler contracts that capacity for several years. A model lab consumes the compute through the hyperscaler. The manufacturer books a cloud AI server sale, the neocloud reports the asset and its backlog, the hyperscaler reports the lease commitment, and the model lab announces a multi-gigawatt pipeline.</p><p>One cluster turns into four reporting streams. Add them together, which is what a lot of market sizing quietly does, and you get a total addressable market that does not physically exist.</p><p><strong>The three ownership buckets.</strong><br>Modeling by owner fixes this, because every dollar of server hardware has exactly one owner even when four parties touch it. That gives us three mutually exclusive buckets:</p><ul><li><p><strong>Hyperscaler-owned hardware:</strong> capacity the largest cloud platforms build and run for themselves.</p></li><li><p><strong>Neocloud and third-party AI factories:</strong> GPU clouds and hosted-compute operators that own infrastructure and rent it out.</p></li><li><p><strong>Enterprise and private AI factories:</strong> systems companies, governments, and regulated industries buy and run inside their own walls.</p></li></ul><p>Consumption still earns its place, just in a separate view that tells us how hard the capacity is working and how durable the demand is. 
It should never be folded back into the size of the market.</p><p><strong>Power capacity needs the same discipline.</strong><br>Gigawatts are also not generic demand signals. A contracted megawatt tied to a hyperscaler campus, a neocloud balance sheet, or a sovereign/private AI factory carries different deployment risk and should not be interpreted the same way. AI server sizing needs a more precise view of power capacity because megawatts determine what can physically deploy, while ownership determines where the hardware is counted.</p><p><strong>The marginal dollar is moving outward.</strong><br>Hyperscalers remain the center of gravity and we believe they will stay there, but the marginal dollar of growth is migrating outward. Neoclouds have gone from a rounding error to a financed capacity layer large enough that ignoring it, or attributing its hardware to the hyperscalers who rent it, throws the whole model off. Enterprise demand is the hardest segment to see cleanly, but it is also the segment where the economics are changing in a way we think the market is under-modeling.</p><p><strong>Why we size enterprise/private AI factories as a new layer of server infrastructure.</strong><br>A class of enterprises will not want every token to be a metered cloud token. As AI moves from experimentation to persistent internal workflows, token costs become an operating variable that companies will manage. Some workloads will stay in public cloud because the flexibility is worth paying for. Others will move closer to the enterprise because sustained utilization, improving local compute, smaller and more efficient models, and maturing on-prem software stacks make owned capacity more attractive.</p><p>The goal is not to bring every AI workload inside the company. 
The goal is to generate more local tokens where the workload is steady, sensitive, and valuable enough to justify the infrastructure.</p><p><strong>Where this connects to permission.</strong><br>Enterprises own AI infrastructure where governance, sovereignty, latency, sustained utilization, and token-cost control make ownership the approvable option. The workloads with the best returns are often the same workloads a compliance team is least willing to run anywhere else. Capacity can be financed and powered, and still sit unused if no one inside the customer is allowed to put sensitive data on it.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;a65e44de-957a-47b9-aead-c98be4f81b89&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Confidential AI: Turning Trust Into AI Infrastructure Revenue&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-06-11T16:35:37.555Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0467222c-ffd3-4708-8fe4-099d9bffbba8_2752x1536.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/confidential-ai-turning-trust-into&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:201519863,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:9,&quot;comment_count&quot;:2,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative 
Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!at7f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eb90428-a00e-4b29-a979-0d47d3bf0802_612x612.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>Sizing the capacity is only one key question. Whether the regulated and sovereign share of it gets permission to run is the other. They are the same question asked from opposite ends of the spectrum.</p><p>The scale is large and growing fast, a market in the hundreds of billions of dollars a year and on a path toward the trillion-dollar range by the end of the decade. The level matters less than the structure, because the structure is what tells us which businesses&#8212;and which infrastructure players&#8212;capture the growth. This in turn informs where the revenue is durable, and where the same hardware is being counted twice. That structure, and the model behind it, is what the full report is for.</p><h2>Inside the full report</h2><p>The subscriber edition turns this framing into a working market model. 
It includes:</p><ul><li><p>The full 2025&#8211;2030 forecast for all three ownership segments, with explicit low, base, and high scenario ranges rather than a single point number.</p></li><li><p>The capex-to-server-hardware bridge that reconciles disclosed hyperscaler capex to deployed, owner-based server TAM, with every adjustment shown so you can argue the assumptions.</p></li><li><p>Three segment deep dives covering hyperscaler, neocloud, and enterprise/private, each built from the company backlogs, capex guidance, contracted power, and channel data behind it.</p></li><li><p>The ownership-versus-consumption matrix, the discipline that keeps model-lab pipelines and leased capacity from being counted twice.</p></li><li><p>Architecture and cost-stack economics, including cost per gigawatt across NVIDIA and custom-ASIC systems and why the architecture mix is a first-order driver of the dollar TAM.</p></li><li><p>Forecast confidence grades by segment, naming the single variable most likely to move each one.</p></li></ul><p></p>
      <p>
          <a href="https://www.thediligencestack.com/p/ai-server-demand-is-becoming-three">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Confidential AI: Turning Trust Into AI Infrastructure Revenue]]></title><description><![CDATA[How confidential compute turns verifiable trust into workload conversion, premium pricing, and higher-quality AI infrastructure revenue.]]></description><link>https://www.thediligencestack.com/p/confidential-ai-turning-trust-into</link><guid isPermaLink="false">https://www.thediligencestack.com/p/confidential-ai-turning-trust-into</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Thu, 11 Jun 2026 16:35:37 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0467222c-ffd3-4708-8fe4-099d9bffbba8_2752x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Note: Infrastructure decisions will shape AI revenue quality. Confidential compute is the first obvious example. As enterprise AI moves from generic workloads to regulated and sovereign workloads, the relevant question becomes less about raw tokens alone and more about which tokens can be processed inside an approved trust boundary. Tokens per watt will remain an important infrastructure metric. We think protected tokens per watt becomes an emerging competitive dynamic, because it captures whether a provider can deliver usable inference capacity for workloads legal, security, and compliance teams will actually approve.</em><br><br>A lot of work on AI infrastructure, ours included, has rightly analyzed it largely as a supply problem. The value of the unit of compute as a systems-level, full-scale optimized design, and the capex required to assemble all of it, still hold relevance in the analysis. But as enterprise and sovereign adoption scale, a second variable keeps showing up in our research that does not appear in any capacity table &#8212; permission. 
The highest-value AI workloads depend on exactly the data that legal, compliance, and security teams are least willing to expose to a multi-tenant cloud or a third-party model provider. Which means capacity can exist, be financed, and be energized, and still sit unusable for the workloads that justify the spend, because nobody inside the customer organization has the authority to approve the data exposure.</p><p>This report argues that confidential computing sits at exactly that boundary, and that it functions as a conversion and pricing layer for AI infrastructure rather than another security budget line. To be clear about what this is: an AI infrastructure revenue-quality report, not a cybersecurity report.</p><h2>The mechanism</h2><p>Enterprises spent two decades building controls, certifications, and audit language around two states of data: at rest and in transit. AI puts pressure on that model because the value is created in a third state, while the data is in use. During inference and agent execution, prompts, model weights, retrieved context, and agent credentials all sit decrypted in memory, where the cloud operator, hypervisor, or another privileged software layer could in principle reach them. Hardware-enforced trusted execution environments, now present in server CPUs and recent GPU generations, are designed to close that gap. Remote attestation then adds the piece that matters commercially: a signed evidence artifact a compliance team can file, an auditor can check, and a regulator can review. Once that artifact exists, approval behavior can change. Approval behavior is what turns blocked workloads into consumed infrastructure.</p><p>For our broader thesis on how enterprise adoption plays out, with the overarching platform/control plane layer, confidential AI is a key component. 
See our report here:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;8f59a03c-7a31-4b79-a8bc-fb34f15ae5d7&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;From Model Wars to Platform Wars&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-05-07T14:57:20.077Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!KJrr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e2b1e42-167f-4c60-b251-b3b58216bf91_2200x1362.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/from-model-wars-to-platform-wars&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:194572162,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:4,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!at7f!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8eb90428-a00e-4b29-a979-0d47d3bf0802_612x612.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2>Two findings worth sharing</h2><p>The first is about pricing power. 
Our research indicates confidential compute instances have commanded premiums on the order of 20 to 30 percent over comparable standard instances in regulated environments, a working estimate we are still tracing to SKU-level pricing. The interesting part is what kind of premium it is. A scarcity premium on GPUs erodes as capacity gets added; that is just supply. A trust premium rests on regulation, audit requirements, and customer risk tolerance, none of which follow semiconductor supply curves down. Generic compute will always face commoditization risk, but verified isolation gives providers a feature customers can pay for.</p><p>That is where protected tokens per watt becomes a useful lens. Standard AI infrastructure is judged by how much inference it can deliver per watt, per rack, and per dollar. Regulated AI adds another requirement: how much of that inference can run inside a trust boundary the customer can approve. The customer is not only buying throughput. They are buying usable throughput. That difference is what gives confidential compute a pricing and retention argument even if baseline AI compute becomes more competitive.</p><p>The second is what we call the double dollar, and it is the value equation in the full report. The premium is the smaller half of the money, because it applies to workloads that were coming to cloud anyway. The larger opportunity is conversion: regulated workloads that did not run at all, at any price, because nobody could approve the data exposure. In our illustrative base case, the conversion effect contributes roughly twice what the premium does, which means anyone modeling the premium alone is missing most of the lift. The full scenario math, conservative through high, sized per $100 billion of CSP AI compute revenue, is in the subscriber report. And the thesis survives even if the premium compresses, because the conversion and retention effects come from workloads clearing compliance rather than from a SKU markup. 
An investor does not need to believe in a permanent premium at the top of the observed range. They only need to believe confidential architecture unlocks production workloads that standard infrastructure could not capture.</p><p>The insight we would leave readers with is this: not all AI cloud revenue is equal. A dollar cleared by legal, compliance, and security should be more durable in a price war than a dollar of developer experimentation. Confidential mix is one of the few observable markers that lets stakeholders separate generic AI consumption from workloads with permission, auditability, and switching friction attached. The market is still modeling AI infrastructure through capacity. We think it should also be modeling permission. Protected tokens per watt is one way to describe that next layer.</p><h2>What the full report covers</h2><p>The subscriber report runs the full diligence framework, including:</p><ul><li><p><strong>The blocked-workload evidence base</strong> &#8212; insights from commentary across banking, healthcare, defense, and real estate on which AI workloads are stalled, what asset is at risk, and what unblocks them.</p></li><li><p><strong>Exhibit 2: the dollar-lift model</strong> &#8212; our per-$100B CSP revenue sensitivity framework across conservative, base, and high scenarios, with the full equation and the input the model is most sensitive to.</p></li><li><p><strong>Exhibit 3: the beneficiary map</strong> &#8212; five tiers from silicon to adjacent AI security, graded by directness, materiality today, and exactly what to track for each.</p></li><li><p><strong>Sovereign AI as a control problem</strong> &#8212; why local data centers alone fail the sovereignty test regulators are actually applying, and the regulatory calendar behind the demand.</p></li><li><p><strong>Private AI factories as governance infrastructure</strong> &#8212; the order-book evidence, agent tokenomics, and the attach economics that decide whether OEM AI revenue 
deserves better than a hardware multiple.</p></li><li><p><strong>Where the model changes</strong> &#8212; company by company: the CSPs, NVIDIA, Dell and HPE, AMD and Intel, and the confidential middleware layer.</p></li><li><p><strong>The honest valuation answer</strong> &#8212; whether any of this moves a stock today, where the lens is most material, and the forward checklist that tells you when the valuation question goes live.</p></li><li><p><strong>What to watch</strong> &#8212; the five verification items that graduate this thesis from working view to conviction.</p><p></p></li></ul>
      <p>
          <a href="https://www.thediligencestack.com/p/confidential-ai-turning-trust-into">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Agentic EDA and the Next Revenue Layer in Chip Design]]></title><description><![CDATA[From design seats to tapeout confidence, verification throughput, and higher revenue density]]></description><link>https://www.thediligencestack.com/p/agentic-eda-and-the-next-revenue</link><guid isPermaLink="false">https://www.thediligencestack.com/p/agentic-eda-and-the-next-revenue</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Tue, 09 Jun 2026 14:59:17 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/22ac5bef-b5fc-4926-9d94-c97f7a2bc34e_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>In recent weeks, we have had direct conversations with both Cadence and Synopsys, and those discussions, set against each company's latest earnings commentary, have firmed our view that agentic design tools add real TAM lift for both. Management at each now describes the mechanism we have been modeling: AI agents pull more work through the vendors' own simulation, verification, and implementation engines, and that work gets monetized on top of the existing subscription base. Both have pointed to a subscription plus consumption model for AI agents, and both are clear that most engagements remain in evaluation rather than full production. That combination, a confirmed direction with unsettled timing, is what this report works through.</em><br><br>EDA has been a quiet beneficiary of AI silicon complexity for years. Bigger chips, faster design cycles, and the spread of custom silicon have each raised the value of the software that carries a design team to tapeout with confidence, and that demand has already shown up in the results at Cadence and Synopsys. That tailwind is well understood and is not, on its own, the reason to revisit the category now. 
What is new is how the complexity gets paid for, because agentic AI raises a question the seat model never had to answer: whether the economics begin shifting from access toward throughput.</p><p>For most of its history, EDA has been valued as an access business. The model was built on seats: how many engineers need the tools, how firmly the workflows hold them, and how much advanced-node complexity lifts renewal value. Those drivers still hold. What has emerged underneath them is a different question, whether an agent stops being a convenience layered on top of the design flow and becomes a productive worker operating inside it. That is the shift that could pull the category off its historical pricing logic, because it changes what the customer pays for from the right to use the tools to the work the tools complete.</p><p>The lens that shapes EDA from here is verified design throughput. A customer in advanced silicon works against a shrinking window in which a fault can still be absorbed, and a fault found late costs far more than the same fault found early. A bug caught near tapeout converts directly into a respin and the schedule slip behind it, and in a competitive node race that slip becomes a roadmap problem before it becomes anything else. An agent that runs inside the design environment and calls the same verification and signoff engines that already gate tapeout does the thing that compounds for an incumbent. It pulls more billable work back into infrastructure the vendor already owns, and it does so at the stage of the flow where the customer is least willing to economize.</p><p>That changes how to read the concern that has hung over the group. Software investors have spent two years asking whether AI compresses seat-based pricing, and EDA keeps getting swept into that same trade. The pressure is real for most application software. 
It is the wrong read here, because EDA does not monetize the way SaaS does, and the constraint that actually binds sits elsewhere. What gates a design organization is its ability to bring a correct chip to tapeout as complexity and schedule pressure keep climbing, and that has little to do with how many engineers sit in front of a license. Once the work moves from manual iteration to autonomous tool usage, a smaller and more productive team can pull more compute and more verification cycles through the same tools, the reverse of the headcount-linked decline the seat-compression thesis assumes. The revenue question shifts from seats sold to throughput verified, and throughput is a consumption variable rather than a headcount one.</p><p>Verification is where the agentic case should prove out first, because it is the part of the flow where rising complexity turns into measurable schedule risk. Custom and analog design is the more differentiated secondary wedge, where the scarcity is institutional knowledge rather than digital complexity, and where native agents embedded inside established flows could monetize design history that has never been easy to encode. The two carry different proof burdens, and that difference is most of what separates the near-term call from the long-term one.</p><p>That same split separates the two companies. Cadence holds the cleaner near-term agentic case and should be able to prove it first, with its strength sitting closest to the verification flow where throughput turns visible soonest. Synopsys may own the larger long-term platform if the Ansys integration lands, though that path carries more execution risk and a longer runway to proof. 
Both can compound from here, and the evidence to underwrite each differs: for Cadence, verification throughput showing up in usage and renewal economics; for Synopsys, integration milestones that convert into design wins rather than roadmap claims.</p><p>The full report is where we size this and gauge it against what the market already pays. Our base case puts the opportunity at $2.5B to $3.0B of incremental annual core EDA revenue by 2030, inside a wider $1.5B to $5.0B range where customer acceptance of consumption pricing, rather than technical capability, is the swing variable. The report builds that revenue bridge step by step, lays out the Cadence versus Synopsys assessment map, works through how verification and custom design actually monetize, and runs the agentic dollar lift against each company&#8217;s current enterprise value to ask whether the opportunity is already priced in and where it defends or extends the multiple. It also sets the risks, from pricing resistance to China exposure, against the thesis, and names the contract-level signals worth watching before any of this shows up in reported numbers.</p><h3><strong>Inside the full report</strong></h3><ul><li><p>A full framework for why agentic EDA should be evaluated through verified design throughput rather than seat access.</p></li><li><p>Creative Strategies&#8217; revenue bridge estimating the potential incremental annual core EDA opportunity by 2030, including base case and sensitivity range.</p></li><li><p>Why verification is the first monetization proof point, and what to watch in regressions, emulation demand, cloud EDA usage, and module attach.</p></li><li><p>Why custom and analog design may be the more differentiated wedge as scarce expertise, proprietary IP history, and node migration become larger bottlenecks.</p></li><li><p>A Cadence vs. 
Synopsys underwriting map, including where Cadence may prove agentic attach first and where Synopsys may have a broader long-term silicon-to-systems opportunity after Ansys.</p></li><li><p>A valuation test that runs the agentic dollar lift against each company&#8217;s current enterprise value, using EV/Sales revenue bars to show what Cadence and Synopsys must earn to support today&#8217;s multiple, and where the lift defends or expands it.</p></li><li><p>The risks investors need to underwrite, including pricing pushback, hyperscaler internal tools, open-source pressure, China/export controls, and Synopsys integration execution.</p></li><li><p>The contract-level evidence that would confirm or disprove the thesis: renewal uplift, agentic SKU attach, usage budgets, production deployment in custom/analog, and margin durability.</p></li><li><p>The key conclusion: the market does not need to believe in fully autonomous chip design for EDA to deserve a different revenue lens. The real question is whether agents create more monetizable work inside workflows Cadence and Synopsys already control.</p></li></ul><p></p>
      <p>
          <a href="https://www.thediligencestack.com/p/agentic-eda-and-the-next-revenue">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The AI Cloud Stack: Where Hyperscalers and Neoclouds Actually Compete]]></title><link>https://www.thediligencestack.com/p/the-ai-cloud-stack-where-hyperscalers</link><guid isPermaLink="false">https://www.thediligencestack.com/p/the-ai-cloud-stack-where-hyperscalers</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Thu, 04 Jun 2026 16:42:03 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!JB7R!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2d1a2e4-fdff-44ec-ac8c-0ffe3ffa69f2_2400x1246.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;e2c58eae-e5db-4ac1-bb1f-f38a3a2f7889&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Neoclouds: The Backlog Quality Test&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-06-02T14:48:29.308Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/54a2a5c8-9627-4213-bfae-e4f3f7fc71e8_1376x768.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/neoclouds-the-backlog-quality-test&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:200199646,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:6,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!dHRL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb5674cd-1c60-409d-9f93-fb9ff7065932_1254x1254.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>This report continues our work on neoclouds, although we are coming at the question from a different angle. The last report focused on backlog quality, monetizable MW, customer concentration, contract duration, financing structure, and the way hyperscaler demand was turning into external infrastructure commitments. That still feels like the right starting point. The next step is a full competitive SWOT, which is a useful exercise for separating near-term capacity leverage from the harder question of platform durability.</p><p>Our thesis remains that neoclouds are a direct proxy for hyperscaler urgency. 
The hyperscalers are still the demand center, but the neoclouds give us a useful read-through because their backlog and financing structures help quantify how that urgency is being translated into external AI infrastructure demand. We continue to think this is the cleanest way to study the space. A large hyperscaler contract, a GPU-backed debt facility, a power reservation, or a multi-year take-or-pay structure tells us something about the pressure inside the cloud market even before that pressure fully appears in reported cloud revenue.</p><p>The report also puts some limits around the neocloud narrative. CoreWeave, Nebius, and IREN are each trying to become more cloud-like, and we believe those efforts are rational. CoreWeave still has the strongest neocloud software and orchestration layer today. Nebius is pushing in both directions, down into owned infrastructure and up into inference and agent software. IREN starts with the clearest ownership of the physical bottleneck, while the Mirantis acquisition gives it a more credible path to build managed GPU services above that power base. All three are taking steps beyond GPU rental, which they need to do if they want the market to value them as more than capacity intermediaries.</p><p>We would still be careful about treating them as future hyperscalers. The big three cloud platforms have breadth that took years to build: identity, security, governance, data platforms, developer services, application integration, global operations, support, compliance, and procurement relationships. That breadth is what makes them difficult to displace across the long tail of enterprise workloads. The neoclouds can be very good GPU clouds and have a meaningful role in AI factory capacity, while still facing a much harder path in competing for the broader enterprise cloud control plane.</p><p>That distinction is the central point of the report. 
Neoclouds are well positioned where capacity speed, GPU availability, financing creativity, and power access are the binding constraints. Those are real advantages in this phase of the cycle. Training demand, burst capacity, and frontier model infrastructure can move toward the provider that can deliver the most usable compute at the right time and price. That is where the neoclouds have earned their relevance. Whether those advantages are long-term differentiators is a question we explore in this report.</p><p>Production inference creates a different test. Once AI workloads move into enterprise deployment, customers start to care more about reliability, identity, governance, data proximity, application workflows, security posture, support maturity, and cost per token. That favors providers with deeper platforms. It also raises the importance of custom silicon in a cost-per-token or all-you-can-eat world. We know from our <a href="https://www.thediligencestack.com/p/the-eai-index-budget-architecture">CIO and CTO work</a> that token cost is becoming one of the central considerations for agentic AI spend. AWS, Google, and Microsoft all have more infrastructure and software depth to apply against that problem, and AWS and Google in particular have more mature custom silicon paths through Trainium and TPUs.</p><p>That is why the scorecard in the full report separates stack presence from business quality. Azure leads our stack-presence score because Microsoft owns enterprise distribution, identity, M365, Dynamics, OpenAI access, and Copilot pull-through. AWS remains the infrastructure trust and custom silicon benchmark. GCP remains the data gravity and TPU economics specialist. Oracle sits in the middle because OCI has credible bare-metal GPU and RDMA capacity, plus a real database and enterprise applications estate, although its AI software middle is thinner than that of the big three.</p><p>The neoclouds split by scarce asset. CoreWeave is the backlog and orchestration case. 
The bull case is revenue visibility and software depth, while the diligence work is customer concentration, lease duration, financing cost, and whether inference becomes a larger part of the mix. Nebius is the most hyperscaler-like of the group, with large Microsoft and Meta commitments, a push toward self-owned infrastructure, and software assets like Token Factory and Tavily. The test is whether those pieces become repeatable consumption economics. IREN is the power-first case. Its advantage is physical and harder to replicate, but the multiple depends on whether managed AI cloud services can turn that power base into recurring revenue.</p><p>The broader conclusion to us is that AI cloud demand is segmenting by workload. Training can follow price and availability. Production inference should stay closer to platforms with identity, data, reliability, and cost-control advantages. Regulated enterprise workloads will put more weight on governance and support. AI-native startups may continue to value speed, GPU access, and modern tooling before procurement standardization starts to matter. Power-constrained capacity has its own logic because the scarce input starts before the GPU cluster is deployed.</p><p>The full-stack map does not give us one winner. It gives us a better way to analyze and value the next phase of the cycle. For hyperscalers, the issue is whether distribution, custom silicon, data gravity, and enterprise trust turn AI demand into durable consumption. For neoclouds, the issue is whether power, orchestration, and capacity access remain scarce enough to support repeatable economics as hyperscaler self-supply catches up. We still think the neoclouds have an important role in AI factories. 
We also think the path from GPU cloud to full cloud platform is a much harder climb than the current backlog numbers alone suggest.</p><h2><strong>What subscribers get in the full report</strong></h2><ul><li><p>A nine-layer full-stack framework for comparing cloud providers across power, data centers, custom silicon, GPU compute, networking, orchestration, model access, applications, and enterprise distribution.</p></li><li><p>A directional scorecard for AWS, Azure, GCP, Oracle OCI, CoreWeave, Nebius, and IREN.</p></li><li><p>Company-level SWOTs for each provider, with a specific diligence focus and read-through for every name.</p></li><li><p>A breakdown of why Azure leads through enterprise distribution, AWS through infrastructure trust and custom silicon, and GCP through data gravity and TPU economics.</p></li><li><p>A bridge-tier analysis of Oracle OCI and whether database and apps gravity can convert into AI workload attachment.</p></li><li><p>A neocloud comparison across CoreWeave, Nebius, and IREN, including backlog quality, power position, software depth, customer concentration, financing, and durability.</p></li><li><p>A workload map showing where training, production inference, regulated enterprise, data and analytics, AI-native startups, office agents, and power-constrained capacity are likely to route.</p></li><li><p>The central diligence question for the next phase: which AI cloud providers own a scarce layer that stays valuable after capacity becomes easier to procure.</p><p></p></li></ul>
      <p>
          <a href="https://www.thediligencestack.com/p/the-ai-cloud-stack-where-hyperscalers">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Neoclouds: The Backlog Quality Test]]></title><description><![CDATA[Why AI infrastructure durability now depends on monetizable MW, contract structure, and renewal discipline]]></description><link>https://www.thediligencestack.com/p/neoclouds-the-backlog-quality-test</link><guid isPermaLink="false">https://www.thediligencestack.com/p/neoclouds-the-backlog-quality-test</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Tue, 02 Jun 2026 14:48:29 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/54a2a5c8-9627-4213-bfae-e4f3f7fc71e8_1376x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Note: This report builds on our April framework separating the different business models emerging across neoclouds, GPU clouds, and power/infrastructure landlords. Later this week, we will publish a more technical SWOT across hyperscalers and neoclouds, focused on how competitive capabilities map to AI workloads, cloud architecture, and broader CSP trends.</em></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;d5e9b9c4-3e18-4a49-a448-c017fea72e88&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Neoclouds and the Three Business Models&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-04-14T16:06:43.791Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/258acade-5321-4ae9-9d74-625034f7df23_2752x1536.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/neoclouds-and-the-three-business&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:193829282,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:8,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!dHRL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb5674cd-1c60-409d-9f93-fb9ff7065932_1254x1254.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>We continue to watch the neocloud and AI infrastructure landlord landscape because these companies remain one of the better market proxies for hyperscaler urgency. The hyperscalers are the demand center, but the neoclouds, GPU clouds, and powered-shell landlords show us how that demand is being translated into contracts, financing structures, power commitments, and monetizable capacity. They also give us a cleaner way to track the dollar value being assigned to each constrained MW across the AI infrastructure stack.</p><p>There was a brief &#8220;thank goodness for the bitcoin miners&#8221; phase of this cycle. 
The hyperscalers were running into a simple problem: AI demand was moving faster than their ability to bring new power online. Bitcoin miners, for all the volatility of the legacy business model, had already done some of the hardest work. They had sites, grid relationships, substations, and access to large blocks of power. That made them useful as a time-to-power accelerant, and it helped explain why the first wave of investor enthusiasm focused so heavily on who had capacity ready to convert.</p><p>That phase rewarded access. Capacity, power, sites, and deployment speed carried valuations because scarcity cleared the market. For a stretch of time, the size of a provider&#8217;s backlog was a reasonable proxy for the quality of its business. That proxy is becoming less useful. As more capacity comes online and more contracts are signed, the relevant question is moving away from the size of the commitment and toward the durable economics behind it. The input that constrains the industry is power, so backlog quality increasingly has to be measured against the economic value created per constrained MW.</p><p>That is what we mean by monetizable MW. The concept is simple enough: how much durable value does a contract produce per constrained MW, and who carries the power, financing, hardware, and renewal risk required to produce it? Headline backlog tells us demand exists. It says much less about how much of that demand turns into residual value after the asset is built, financed, depreciated, and renewed.</p><p>The spread is large enough to matter. Powered-shell landlords monetize roughly $1.2M to $2.3M of revenue per MW each year and generally transfer GPU ownership and obsolescence risk to the tenant. GPU clouds and full-stack neoclouds monetize several times more, often $7M to $13M per MW, while keeping more of the financing intensity, utilization risk, GPU depreciation, and renewal exposure that come with owning the compute. 
That is one of the more important takeaways from the work: the same constrained MW can support a roughly 5x revenue-density spread depending on where the company sits in the stack. Higher revenue density can be attractive, but it usually comes with more of the risk.</p><p>There is also a power-acquisition cost sitting beneath the entire discussion. Our research points to roughly $4M of generation capex per usable MW before transmission, distribution, shell, cooling, network, or GPU clusters are included. The exact figure moves by region, power source, interconnection status, and timing. The larger point is that the MW itself is expensive long before it earns anything.</p><p>That cost links the power debate directly to backlog quality. Every provider in this category is trying to turn constrained MW into durable revenue before one of three clocks runs out: the customer contract, the useful life of the asset, or the financing behind it. A model tracks well when revenue per MW, contract duration, and asset life stay aligned long enough to clear the capital stack. It gets harder when scarce power is tied to revenue that fades before the leases, GPUs, or debt are paid down.</p><p>We also think the distinction between firm MW and flexible MW will become more important. Some AI workloads can shift across time or geography in ways traditional enterprise workloads cannot. That creates room for demand response, interruptible operation, workload shifting, and dynamic token pricing. For operators that can run power-aware, flexibility becomes a potential operating advantage. Power stops being a passive input and becomes an asset that can be managed.</p><p>Applied across the major names, this does not produce one clean ranking. Power Landlords screen strongest on duration and hardware-risk transfer. IREN owns the scarcest asset in the cycle, deliverable power, and still has to prove a durable compute premium beyond its first hyperscaler-backed deployment. 
Nebius has the most interesting contract design among the platform names, with a disclosure gap that keeps its optionality from becoming a conclusion. CoreWeave has the strongest commercial validation in the group and the clearest mismatch between long data center leases, shorter customer contracts, and GPU depreciation.</p><p>The first expansion signals are encouraging. Several large customers have added capacity, extended commitments, or signed new multi-year anchor agreements, which supports the view that external AI infrastructure remains strategically useful while internal capacity catches up. We still separate expansion during scarcity from renewal discipline once customers have more choices. The current wave proves demand, urgency, and willingness to use external partners. The more important test is whether those customers renew at attractive pricing once internal capacity improves, custom silicon is more widely deployed, and prior-generation GPUs need a second economic life in inference.</p><p>One distinction runs through the whole analysis. Stronger contracts reduce business-model risk without necessarily improving the equity claim. A high-quality customer commitment can still sit inside a capital structure that leaves limited residual value for shareholders. That is why we separate the quality of the customer commitment from the quality of the shareholder claim throughout.</p><p>The full report turns this into a company-by-company assessment of CoreWeave, Nebius, IREN, and the powered-shell landlord cohort, including Core Scientific, Cipher, TeraWulf, Hut 8, and Galaxy&#8217;s Helios campus. We map who owns the risk across power delivery, GPU depreciation, utilization, refinancing, and renewal exposure, and we quantify the CoreWeave duration math, the Nebius disclosure gap, the landlord revenue-per-MW tradeoff, and the renewal-discipline signals we will monitor from here.</p><p>The goal is to make the diligence work more precise. 
AI infrastructure demand is visible, but the next layer of separation will come from how well each model aligns key variables like monetizable MW with customer duration, asset life, financing terms, and risk ownership. That alignment is a more important directional signal than the headline backlog number, and it is what we will be watching from here.</p><h1><strong>Inside the Full Report</strong></h1><p>The full report turns this framework into a company-by-company underwriting, backed by the model and a full set of exhibits.</p><blockquote><p>&#8226; The monetizable-MW framework in full, with revenue-per-MW and capex-per-MW ranges for landlords, GPU clouds, and full-stack neoclouds, and the UBS power-acquisition layer that sits beneath them.</p><p>&#8226; Company deep dives on CoreWeave, Nebius, IREN, and the powered-shell landlord cohort, including Core Scientific, Cipher, TeraWulf, Hut 8, and Galaxy&#8217;s Helios campus, each underwritten on duration, risk ownership, and renewal exposure.</p><p>&#8226; The CoreWeave duration math, including why revenue per MW reads near $18.5M on active power but compresses to about $5.3M across contracted power, and what that swing means for the renewal case.</p><p>&#8226; The Nebius disclosure gap quantified, with company-level ARR implying roughly $0.5M per MW against an estimated $16M per MW on the Microsoft hosting agreement.</p><p>&#8226; A side-by-side map of who owns the risk across full-stack platforms, bare-metal GPU clouds, and Power Landlords, layer by layer from power delivery to refinancing.</p><p>&#8226; The firm-versus-flexible MW distinction as a new operating advantage, and how power-aware operation opens a second revenue line.</p><p>&#8226; A risk scorecard ranking the models on power control, contract duration, hardware-risk transfer, financing support, disclosure quality, and customer concentration.</p><p>&#8226; The hyperscaler paradox, and what the next renewal wave will reveal about true dependence as 
internal capacity and custom silicon arrive.</p><p>&#8226; A renewal-discipline tracker naming the specific contract signals we will monitor, from prepayments and step-down pricing to GPU residual values and project-finance spreads.</p><p>&#8226; Where we could be wrong, and the data caveats behind the model.</p></blockquote><p></p>
      <p>
          <a href="https://www.thediligencestack.com/p/neoclouds-the-backlog-quality-test">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Memory in the Age of Inference]]></title><description><![CDATA[How Concurrent User And Agent Sessions Turn Memory Into A System Architecture Problem]]></description><link>https://www.thediligencestack.com/p/memory-in-the-age-of-inference</link><guid isPermaLink="false">https://www.thediligencestack.com/p/memory-in-the-age-of-inference</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Thu, 28 May 2026 15:24:56 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!IZyo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fb92412-f909-4d10-a662-f8804c08fd29_1800x618.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;eedf1829-5c5f-46e8-8f39-e5a99903a729&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Memory's $200B Inflection&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-02-19T17:15:37.190Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!rSQ7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d87bc96-3327-41e2-955f-511bac8f0f89_1244x890.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/memorys-200b-inflection&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:187534208,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:14,&quot;comment_count&quot;:1,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!dHRL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb5674cd-1c60-409d-9f93-fb9ff7065932_1254x1254.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>A lot has happened since our first anchor report on memory back in February. We argued that the first phase of the AI memory story was about repricing, coupled with the market&#8217;s need to understand how the demand cycle has changed with AI compute. As we anticipated, HBM became scarce, conventional DRAM tightened, NAND began to benefit from AI storage demand, and memory moved from a background input in the server bill of materials to one of the more visible constraints in the AI infrastructure stack. 
That repricing is still a driver because it gave the market a clear signal that memory had become large enough to affect AI infrastructure economics. The next question is more durable: what does memory normalize into as inference becomes a larger share of AI workloads?</p><p>We think the answer depends on how inference scales. Early inference could often be understood as a prompt-response workload. A user asks a question, the system generates an answer, and the economics can be framed mostly around cost per token. That framing remains useful, but it becomes incomplete as AI usage shifts toward many concurrent user and agent sessions. The system has to keep those sessions useful while work is happening. It has to preserve context, maintain state, retrieve information, manage tool calls, and carry a workflow forward across multiple steps. The user may only see a short answer or a completed task, but underneath, the system is holding much more live state than the interface suggests. Thinking models and reasoning models have fundamentally changed the demand for inference compute and the entire inference cell infrastructure.</p><p>At scale, inference-specific compute is bound by how many concurrent users and agents it can handle at one time. That is the central memory/storage shift taking place. Inference memory demand should be modeled around how many live sessions the system has to support, and where the state for those sessions resides. Longer context pulls on accelerator memory and host memory because more information has to stay close to execution. KV cache turns concurrency into a capacity problem because every active session carries memory with it. Agentic workflows extend the problem further because the system has to remember what the agent is doing while it is doing it. As these workflows become more persistent, the memory question becomes less about a static bill of materials and more about the operating capacity of the AI system. 
In <em>Storage Shock</em>, we framed the storage version of this problem as the need to keep more data warm enough to be called and used. The memory version is that more inference state has to stay live enough for the system to continue the work. Context is part of that state, but the larger issue is continuity: the system has to preserve enough of the session&#8217;s active memory so the next step can happen without rebuilding the workflow from scratch.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;bba5d287-4933-4fcd-8979-5e2a377d7d70&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The Agentic AI Storage Shock&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-05-21T15:27:52.131Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!dEqI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa66aeade-08be-401e-a4f4-e4b8da29998d_1800x1050.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/the-agentic-ai-storage-shock&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:198594825,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:17,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative 
Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!dHRL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb5674cd-1c60-409d-9f93-fb9ff7065932_1254x1254.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>This is why memory increasingly needs to be viewed as a hierarchy rather than a single component category. HBM remains the clearest near-term constraint because it is tied directly to accelerator roadmaps and high-bandwidth execution. Server DRAM becomes more important as <a href="https://www.thediligencestack.com/p/secret-agent-cpu-revisited">AI head-node and CPU-side execution demand rise</a>. NAND and enterprise SSDs become more relevant when some state can tolerate additional latency and move closer to the serving path. Other layers remain earlier in their proof cycle, but they are useful signals for where the architecture may go if agentic infrastructure becomes more repeatable.</p><p>For stakeholders, the important point is that memory intensity can rise with utilization as well as with server shipments. <strong>A fleet can become more memory constrained when the same installed base is serving more live sessions for longer periods of time.</strong> That means the memory model has to account for how long sessions remain active, how much state they retain, and how efficiently the system can move that state across the hierarchy. The central question is whether enough of the session can stay close to compute to preserve performance, while lower-cost tiers absorb the state that does not need to remain in the hottest layer.</p><p>This also changes how we should think about cyclicality. 
Commodity memory will still cycle, and familiar variables such as pricing, supply, inventory, and customer digestion will continue to drive revisions in the segments of the industry that still follow that dynamic. However, the more salient point is that AI-attached memory can behave differently when it is tied to qualification, roadmap certainty (LTAs), and system output. In AI infrastructure, memory decisions are moving earlier into system design because they affect how much useful inference work the platform can support.</p><p>Standards remain essential because they make the ecosystem buildable, but the interface increasingly becomes the floor rather than the source of differentiation. The value shifts toward suppliers whose memory fits the system roadmap, performs within the power and thermal envelope, exposes enough usable capacity to software, and helps the platform support more concurrent inference work. That is why memory should be forecast by where inference state lives. Accelerator-attached memory follows accelerator and ASIC roadmaps, server DRAM follows AI head-node and CPU execution demand, and NAND or enterprise SSDs become more relevant when context and persistent state can tolerate more latency. 
Memory will still cycle, but a cycle-only lens is too blunt once inference creates a hierarchy of memory constraints that shape operating capacity and value capture.</p><h2>What&#8217;s Inside the Full Report</h2><ul><li><p>A full framework for moving from the memory pricing shock to the next question: whether inference scale changes the structural role of memory after pricing cools.</p></li><li><p>A detailed explanation of why concurrent user and agent sessions turn memory into a live-state capacity problem.</p></li><li><p>A breakdown of where inference state lives across HBM, server DRAM, SOCAMM/LPDDR, NAND/eSSD, and CXL.</p></li><li><p>A model bridge for forecasting memory demand by tier rather than using one blended memory line.</p></li><li><p>A deeper discussion of KV cache, context length, active state, and why memory pressure can rise before accelerator unit growth fully explains the move.</p></li><li><p>A framework for why CPU core count may become a memory-attach signal in agentic inference systems.</p></li><li><p>Scenario logic for translating CPU-side memory assumptions into long-term DRAM demand sensitivities.</p></li><li><p>A beneficiary map separating current evidence from later-cycle opportunities across memory suppliers, SSDs, controllers, interface silicon, rack integration, and CXL.</p></li><li><p>A disciplined view of what the thesis does and does not require, including why commodity memory can still cycle while AI-attached memory behaves differently.</p></li><li><p>A monitoring dashboard for what would confirm, weaken, or force a rethink of the thesis.</p></li></ul><p></p>
      <p>
          <a href="https://www.thediligencestack.com/p/memory-in-the-age-of-inference">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Secret Agent CPU, Revisited]]></title><description><![CDATA[From Architecture Call to Market Model]]></description><link>https://www.thediligencestack.com/p/secret-agent-cpu-revisited</link><guid isPermaLink="false">https://www.thediligencestack.com/p/secret-agent-cpu-revisited</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Tue, 26 May 2026 15:10:32 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!W61N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b626cef-adc5-4bdf-ba38-165ad8ecf5b8_1824x1221.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;0cdac70c-845e-4014-a36e-0098bebf8b7f&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Secret Agent CPU&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-03-24T18:32:21.001Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/111187e0-6082-4353-bd19-06bb492be439_2816x1536.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/secret-agent-cpu&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:191156925,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:15,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!dHRL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb5674cd-1c60-409d-9f93-fb9ff7065932_1254x1254.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>When we first published Secret Agent CPU in March, the goal was to understand how inference changes when the workload moves from answering a prompt to completing a workflow, and to build a market-sizing model for how much agentic use cases could increase the TAM for server CPUs. Our research found that, in traditional inference, the GPU carries the model math and the CPU can sit closer to a host-control role. Agentic inference creates a different operating problem but also an opportunity for new infrastructure. The system has to keep the workflow alive while the model waits on tools, preserves state, checks permissions, and interacts with external systems. 
The GPU remains essential,<strong> but the return on that GPU increasingly depends on whether the surrounding infrastructure can keep the agent loop moving.</strong></p><p>That was the architectural point in the original report. The CPU becomes more important because agentic inference pushes more work into the execution layer around the model. This is where the old AI infrastructure framing starts to break down. A prompt-response system can be measured mostly through accelerator throughput. A production agent has to be measured through the whole inference cell: accelerator racks that generate tokens, CPU execution racks that keep environments alive, and the memory fabric that keeps state close enough to use.</p><p>We emphasize this distinction because demand per user changes as AI moves from chat into work. A lightweight assistant may only need a narrow CPU-side footprint while the model responds. A more capable agent can fan out into multiple live environments as it performs a host of CPU-specific tasks: retrieving data, running code, calling tools, verifying outputs, and waiting on systems of record. In that world, infrastructure limits are measured by tokens per second and by the number of simultaneous agentic execution environments each active user can sustain.</p><p>The new development is that this orchestration layer is becoming more visible as a product category. Recent market sizing work has started to separate server CPU demand into more useful buckets, but the more important signal is the emergence of dedicated CPU orchestration racks as part of the inference architecture. These racks are likely to sit in line with GPU and XPU racks as part of the complete agentic inference system. That moves the thesis beyond a simple head-node attach story. Head-node CPUs still matter inside accelerator systems, but our original Agentic CPU thesis was always closer to a dedicated orchestration-layer thesis.</p><p>That shift is what changed our model. 
If the CPU role is limited to head-node content inside accelerator systems, the opportunity is real but more bounded. If dedicated CPU racks become part of the production inference path, the server CPU TAM expands differently because the CPU becomes part of the execution fabric around the model. In the full report, we raise our 2030 server CPU TAM base-case scenario by roughly 25%-30% versus the original framework, while keeping the higher cases tied to clearer evidence that dedicated orchestration racks become a repeatable hyperscaler procurement pattern.</p><p>We are still keeping range discipline and, if anything, still modeling this conservatively. The evidence supports a higher base case, while the most aggressive outcomes still require more production proof. Recent public commentary has put a large CPU revenue marker into the market, including standalone CPU servers and CPU content inside larger AI systems. The exact mix still needs cleaner disclosure, but the direction supports the notion that dedicated CPU capacity is becoming part of the inference system rather than a distant bull-case abstraction.</p><p>The second layer is memory. Once dedicated CPU racks become part of the agentic inference architecture, CPU-attached memory intensity becomes one of the best signals to monitor. SOCAMM and LPDDR-based server memory are not literally HBM for CPUs. The technologies differ in packaging, bandwidth, economics, and supply chain. The strategic analogy is better stated as: HBM keeps accelerator math engines fed, while dense CPU-attached memory may become part of the fabric that keeps agentic CPU racks productive. 
As more state sits near the CPU tier, the market has to underwrite memory capacity, bandwidth, and power efficiency as part of the CPU story.</p><h4>The full report includes:</h4><ul><li><p>Our revised Creative Strategies server CPU TAM framework and the uplift from the original <em>Secret Agent CPU</em> model.</p></li><li><p>A gross-versus-net demand model that separates existing server CPU demand from new agent-native orchestration demand.</p></li><li><p>A revised architecture view that places dedicated inline CPU orchestration racks closer to the center of the original thesis.</p></li><li><p>A hyperscaler and neocloud framework for understanding who absorbs early CPU demand and who may need to fund more surrounding infrastructure.</p></li><li><p>A beneficiary map that extends the read-through beyond CPUs into memory, storage, networking, rack integration, power, cooling, and client execution.</p></li><li><p>A scenario-based company model for Intel, AMD, and Arm/custom silicon, focused on dollar growth and share shift rather than stock recommendations.</p></li><li><p>A CPU-attached memory model that explains why agentic orchestration may create a second-order DRAM and SOCAMM opportunity.</p></li><li><p>Sensitivity work for CPU attach, AI CPU ASPs, memory per orchestration CPU, and agentic infrastructure intensity per user.</p></li><li><p>An inference-cell framework showing why served-user capacity depends on both GPU token throughput and CPU-side execution environments.</p></li><li><p>An evidence scorecard separating what is proven, what is directionally supported, and what still needs deployment proof.</p></li><li><p>A monitoring dashboard for dedicated rack orders, CPU attach per accelerator, memory bandwidth per CPU, and SOCAMM adoption.</p></li></ul>
      <p>
          <a href="https://www.thediligencestack.com/p/secret-agent-cpu-revisited">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Agentic AI Storage Shock]]></title><description><![CDATA[How enterprise agents turn data lakes, workflow logs, and generated artifacts into the next infrastructure gating layer]]></description><link>https://www.thediligencestack.com/p/the-agentic-ai-storage-shock</link><guid isPermaLink="false">https://www.thediligencestack.com/p/the-agentic-ai-storage-shock</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Thu, 21 May 2026 15:27:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!dEqI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa66aeade-08be-401e-a4f4-e4b8da29998d_1800x1050.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We spent the early part of the week at Dell Tech World, and a specific takeaway from conversations with Dell executives, customers, and practitioners was that the enterprise data layer is moving back to the center of the AI discussion. Agentic AI makes that inevitable once you grasp how the workflow is changing. Claude Code/Cowork and Codex are useful early indicators because they show what happens when AI moves beyond a cloud chat interface and gains read/write access to a working file system. Once the model can operate inside the project environment, the workflow leaves behind more than a final output. It creates a durable record of how the work was produced, what context informed it, and what changed before the result was accepted.</p><p>That is the same shift enterprises are preparing for across broader workflows. Agents will need more than model access or an application interface. They need enterprise context that can be retrieved, permissioned, verified, acted on, written back against, and retained. 
That requirement puts pressure on the entire data stack, from storage and retrieval to governance, observability, and the software layer that controls access to the data.</p><p>A useful way to understand the shift is the movement from cold data to warm data. Cold data is stored, retained, archived, or otherwise technically accessible, but not necessarily prepared for live use inside a workflow. It may sit in documents, old ticket histories, file shares, application databases, compliance archives, email threads, or data lakes. A human user can often work around that fragmentation by searching, interpreting, asking someone, or applying judgment. An agent cannot reliably do that unless the context is represented in a way the system can retrieve, permission, verify, and act on.</p><p>Agentic AI pushes more of that data into a warm operating layer. Warm data is not necessarily hot transactional data, but it is close enough to the work to be useful. It is indexed, governed, permission-aware, observable, and current enough to support an action. A support agent needs prior tickets, product documentation, entitlement rules, customer history, and an audit path. A procurement agent needs supplier terms, shipment data, approval thresholds, exception history, and writeback into the system of record. A coding agent needs the repository, dependencies, test logs, build artifacts, and security scans. In each case, the agent is not simply searching information. It is using enterprise context to complete work.</p><p>That observation changes how we should think about storage and the software layer above it. In an agentic enterprise, data has to be stored economically, protected, and recovered, while also becoming available for retrieval, reuse, governance, and audit. More workflow history becomes operating context. More access paths require permissioning. More agent actions create records that need to be logged for compliance, quality control, and future use. 
A growing share of the storage layer therefore participates directly in execution.</p><p>This is why we think the next phase of enterprise AI is best understood as a data architecture transition. The agent interface will get attention because it is what users see. The durable operating change sits underneath: trusted retrieval, identity, permissions, approvals, tool calls, writeback, observability, and retention. The result is a loop: agents need enterprise context to act, and every action creates new enterprise context. If that record is retained, governed, and made retrievable, the output of one workflow becomes an input into the next.</p><p><strong>The uncomfortable version of this thesis is that if agentic AI works, the enterprise storage stack is underbuilt.</strong> Agents consume context, generate workflow records, create artifacts, and turn prior work into future operating memory. That means the data layer has to support both sides of the agentic loop: fast access to warm context during execution and durable retention of the new data created by the workflow. We explored the hardware side of this in our earlier report, Storage Wars, where we argued that flash is moving from a persistence layer into an active extension of the inference memory hierarchy. 
The agentic enterprise extends that logic from the AI rack into the enterprise data estate.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;dbcb2d97-e947-406f-bc23-a1c014fe7902&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Storage Wars: When Memory and Storage Collapse Into One Layer&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-04-07T15:07:47.695Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a7337085-4659-43b1-89ba-502fab16c373_2752x1536.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/storage-wars-when-memory-and-storage&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:193076442,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:7,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!dHRL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb5674cd-1c60-409d-9f93-fb9ff7065932_1254x1254.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>That loop gives stakeholders a more useful way to track the theme. Agentic AI spend is unlikely to appear as one clean budget line. 
It will show up through services work, workflow modules, governance products, observability and security tools, data platform modernization, and infrastructure refresh. The best-positioned companies are likely to be those closest to the work, closest to the data, and closest to the retained record of what the agent actually did.</p><h3>For paid subscribers, the full report includes:</h3><ul><li><p>Our full framework for the Agentic AI Data Flywheel and why enterprise data shifts from passive information to active operating memory.</p></li><li><p>A breakdown of the enterprise agent execution stack, including workflow ownership, systems of record, identity, permissions, retrieval, observability, writeback, and retention.</p></li><li><p>A deployment evidence map showing where agentic AI is moving first across ITSM, support, procurement, payroll, coding, underwriting, legal, and regulated workflows.</p></li><li><p>A framework for the three spend pools: software control plane, data readiness, and memory/storage infrastructure.</p></li><li><p>A detailed view of the software control points and which categories of vendors are best positioned.</p></li><li><p>A beneficiary map of which vendors across the stack are poised to benefit from this dynamic.</p></li><li><p>A storage and memory stack analysis covering HBM, DRAM, SOCAMM, enterprise SSDs, QLC NAND, object storage, nearline HDD, archive, controllers, and networking.</p></li><li><p>A directional sensitivity model for how much data different agent workflows may create, from text research and decks to images, video, code, legal workflows, and engineering simulations.</p></li><li><p>Our bull and bear case for the thesis, including the key indicators we will track to see whether agents are moving from assisted productivity into controlled execution.</p></li></ul><p></p>
      <p>
          <a href="https://www.thediligencestack.com/p/the-agentic-ai-storage-shock">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Tokenomics and the Fixed-Cost Economics of AI Factories]]></title><description><![CDATA[A companion model to The Inference Payback that turns revenue per MW into rack-hour math: committed GPU capacity, paid tokens, and the falsifiers that would break the factory economics.]]></description><link>https://www.thediligencestack.com/p/tokenomics-and-the-fixed-cost-economics</link><guid isPermaLink="false">https://www.thediligencestack.com/p/tokenomics-and-the-fixed-cost-economics</guid><dc:creator><![CDATA[Max Weinbach]]></dc:creator><pubDate>Tue, 19 May 2026 17:05:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!29WA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14970767-e638-4512-9242-d2c30216b3a6_2200x720.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>From Inference Payback to Rack-Hour Economics</h2><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;2177b655-a218-43d7-a9af-d45eeb4db033&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The Inference Payback&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-05-14T15:43:14.807Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!y_Ni!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fded3147b-8c6e-4f18-9d97-82355565536b_1600x950.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/the-inference-payback&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:197557765,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:5,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!dHRL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb5674cd-1c60-409d-9f93-fb9ff7065932_1254x1254.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>Our last report, <em>The Inference Payback</em>, framed the next AI infrastructure question at the factory level. Token growth is useful evidence of demand, but it does not tell us whether AI factories can earn their cost of capital. The more important test is profitable demand density: how much external, paid, high-margin inference a deployed megawatt can support after power, leases, depreciation, financing, utilization, and refresh cycles.</p><p>This report moves one layer lower in the same economic question. The prior report looked at whether inference can fund the broader AI buildout. 
This report looks at the production mechanics underneath that payback case. The unit of analysis is the rack-hour: how many sellable tokens a fixed AI factory can produce per hour, at what realized price, and with how much of annual capacity allocated to inference rather than model development, internal research, idle reserve, or improvement loops.</p><p>That is the reason for the model. Revenue per megawatt and gross profit per megawatt are the board-level payback metrics. Paid tokens per rack-hour are the operating mechanism underneath them. A factory can look large in GPUs and megawatts while still underperforming economically if too much capacity is absorbed by work that does not monetize well or does not monetize at all.</p><p>When an AI lab rents a large GPU cluster on an annualized basis, that capacity starts to behave like a fixed-cost production asset. The company has committed to a cost base. The operating question becomes how many revenue-generating tokens that factory can produce per hour, per rack, and per year.</p><p>This is the context behind Jensen Huang&#8217;s &#8220;AI factory&#8221; language. A modern AI data center is a production asset. Training, post-training, research, internal evaluation, and inference all compete for the same factory capacity. Inference is the clearest path to direct revenue generation, which makes rack-hour allocation one of the central economic variables in the model.</p><p>The purpose of this report is to give readers a way to translate AI infrastructure scale into token economics. GPU count, megawatts, rack architecture, GPU-hour pricing, throughput, utilization, and realized token pricing are all relevant, but they only become useful when they are connected in one model. 
The rack-hour framework lets us ask how much paid inference capacity a factory can produce and how much fixed cost that capacity has to absorb.</p><p>OpenAI and Stargate are useful reference cases because more public information and estimates exist around their scale, power requirements, architecture, and possible rental economics. The framework is broader than OpenAI. The same logic applies to any company renting or operating large-scale AI compute, including frontier labs, hyperscalers, and neocloud platforms. The specific inputs will vary by architecture, contract, workload mix, pricing model, and utilization, but the economic question is the same: how many paid tokens can the deployed infrastructure produce against its fixed cost base?</p><p>One scope note. This report is focused on the AI lab or compute customer side of the transaction. We are asking how a company that has committed to a GPU-hour bill converts that fixed cost into paid inference. The provider-side model is different. Oracle, CoreWeave, Crusoe, Nebius, hyperscalers, and other neoclouds have their own questions around capex recovery, power pricing, financing, ROIC, customer concentration, and residual value. Those economics matter, but they sit one layer away from this model.</p><p>The output should be read as a capacity framework, not a revenue forecast. We are not trying to identify a single Stargate-like price or produce a definitive operating model for one facility. We are giving readers a way to underwrite the payback logic of any large AI factory: start with the fixed compute commitment, translate it into rack-hour burden, estimate sustained token throughput, apply realized token pricing, and then test how much of the annual factory can be kept in revenue-generating inference.</p><p>That is the central question for the inference era. 
Competitive advantage will accrue to the companies that convert deployed infrastructure into paid token volume most efficiently, with utilization, model architecture, serving software, and pricing power all compounding into better factory economics.</p><h3>In the full report, paid subscribers get:</h3><ul><li><p>A rack-hour companion model to The Inference Payback that translates revenue-per-MW and GP-per-MW into token throughput, realized pricing, and fixed-factory cost burden.</p></li><li><p>A Stargate Abilene reference case using roughly 100K GB200 GPUs, 1,389-1,400 NVL72 racks, and three GPU-hour rental scenarios.</p></li><li><p>A pricing-mix bridge from GPT-5.5-style list prices to a blended $6.67 per 1M total tokens, with the caveat that list price is not realized revenue.</p></li><li><p>Throughput sensitivities for GB200 and GB300 racks, including active-parameter, decode, batch-size, and serving-efficiency assumptions.</p></li><li><p>An annualized factory model that separates theoretical rack-hours from inference rack-hours, training, research, idle reserve, and internal improvement loops.</p></li><li><p>A fixed-cost-per-token sensitivity showing when the factory burden remains manageable and when throughput, pricing, or allocation break the spread.</p></li><li><p>A scope boundary between the AI lab/customer model and the neocloud or hyperscaler owner model, so the two payback questions do not get blended together.</p></li><li><p>A monitoring framework for realized token pricing, sustained rack throughput, inference allocation, demand absorption, serving-stack efficiency, and the thesis breakers that would make the model fail.</p></li></ul>
      <p>
          <a href="https://www.thediligencestack.com/p/tokenomics-and-the-fixed-cost-economics">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Inference Payback]]></title><description><![CDATA[What Has to Be True for AI Factories to Fund Themselves]]></description><link>https://www.thediligencestack.com/p/the-inference-payback</link><guid isPermaLink="false">https://www.thediligencestack.com/p/the-inference-payback</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Thu, 14 May 2026 15:43:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!y_Ni!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fded3147b-8c6e-4f18-9d97-82355565536b_1600x950.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;3ba9a6e4-0444-4d4a-af71-074448a5d445&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;From AI Usage to AI Earnings Power&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-05-05T14:50:20.642Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1c391c28-b04b-4830-b0d4-63fd555a0372_2816x1536.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/from-ai-usage-to-ai-earnings-power&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:196440058,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:6,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!dHRL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb5674cd-1c60-409d-9f93-fb9ff7065932_1254x1254.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;2b64cc1f-ca68-4a38-bd59-ba52bbc5ef37&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;From Model Wars to Platform Wars&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-05-07T14:57:20.077Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!KJrr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e2b1e42-167f-4c60-b251-b3b58216bf91_2200x1362.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/from-model-wars-to-platform-wars&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:194572162,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:3,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!dHRL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb5674cd-1c60-409d-9f93-fb9ff7065932_1254x1254.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;3ded0d95-2557-46c2-8ea7-8811424e928b&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The E/AI Index: Budget Architecture and the Next Phase of Enterprise AI Adoption&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-05-12T15:15:42.845Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!dvsA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1700cc0-9edd-447c-80a6-c53ee7ad6aa7_636x380.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.thediligencestack.com/p/the-eai-index-budget-architecture&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:196476560,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:2,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!dHRL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb5674cd-1c60-409d-9f93-fb9ff7065932_1254x1254.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>Our last several reports worked through the customer side of AI monetization. The AI ROI report looked at where AI creates measurable workflow value. <em>Platform Wars</em> asked which layer captures the economics once those workflows move into production. The E/AI Index moved into budget architecture, showing how CIOs and CTOs are funding AI through new budgets, reallocation, and deployment maturity.</p><p>This report moves to the infrastructure side of the same cycle. 
If customers are finding ROI, platforms are competing to own the workflow layer, and CIOs are forming AI budgets, the next question is whether the factories underneath that demand can earn their cost of capital. The buildout now has to show that consumption can carry compute, power, leases, depreciation, financing, and refresh cycles.</p><p>We remain constructive on AI demand. Models are improving, user bases are expanding, enterprise experimentation is deeper than it was a year ago, and the largest platforms continue to commit capital at a scale that only makes sense if they believe usage keeps compounding. Demand growth, on its own, is no longer the hard part of the argument. The harder part is understanding what kind of demand it is, who pays for it, and whether it produces enough gross profit per MW to carry the infrastructure behind it.</p><p>That is why we keep coming back to profitable demand density per MW. Training capital was easier to justify when the main objective was capability: better models, scarce compute access, and position on the capability curve. Inference has a different burden. It is where products expose demand, customers test pricing, and fixed infrastructure cost has to be absorbed by workloads that either generate revenue directly or create enough strategic value to justify the capacity they consume.</p><p>Token growth can come from very different places, and those differences are where the economics get messy. A paid API call tied to an enterprise workflow is not the same as a free consumer query, a discounted batch job, an internal eval run, or a long chain of hidden reasoning tokens that never appears directly to the user. Each consumes compute. Only some produce revenue with enough pricing power and margin to help pay for the factory behind it. That is the mix problem. 
Token volume can look healthy while realized pricing, serving cost, utilization quality, or revenue traceability all move in the wrong direction.</p><p>Enterprise spend is an important part of the answer, and it should be one of the cleaner early proof points. The CIO budget work gives us more confidence that real enterprise AI budgets are forming. We would still be careful about treating those budgets as enough to fund the full hyperscaler AI buildout on their own. The infrastructure case needs several things to work at the same time: paid enterprise workflows, consumer monetization, internal product lift, platform services, high utilization, lower cost per token, sustained premium demand, and disciplined capital deployment. Enterprise AI TAM is too narrow a denominator for that broader payback question. The better unit of analysis is profitable AI revenue and strategic value per MW.</p><p>For our diligence work, this moves the tracking burden lower in the stack. Capex, product adoption, and revenue growth are useful, but they are still too high level to answer the payback question. The harder work is estimating realized pricing, workload mix, utilization, depreciation pressure, lease exposure, and the rate at which hardware and software improvements reduce cost per token. A high-usage world can still produce weak payback if too much of the demand is free, discounted, internal, or expensive to serve.</p><p>Our subscriber report builds the model behind that view. It separates token growth from profitable inference demand, uses revenue per MW and gross profit per MW as the operating lens, and tests what has to be true across hyperscalers, frontier labs, neoclouds, and private AI factories. 
<strong>The goal is to identify which operators have enough control over power, silicon, pricing, distribution, utilization, and balance sheet risk to make the math work.</strong></p><h3><strong>In the full report, paid subscribers get:</strong></h3><ul><li><p>A payback model for separating token growth from profitable inference demand.</p></li><li><p>The token demand funnel: total demand, monetizable demand, and profitable demand.</p></li><li><p>A revenue-per-MW lens for comparing hyperscalers, frontier labs, neoclouds, and private AI factories.</p></li><li><p>Scenario work showing what has to be true for inference to fund the infrastructure buildout.</p></li><li><p>A treatment of why enterprise AI budgets are only one part of the payback stack.</p></li><li><p>A breakdown of why internal inference can create strategic value and weaken clean revenue-per-MW math.</p></li><li><p>A view on which operators have structural advantages through power, silicon, distribution, pricing, and balance sheet capacity.</p></li><li><p>A monitorable set of signals for pricing, utilization, capex/MW, GP/MW, depreciation, lease exposure, refinancing risk, and the next 12 to 24 months of evidence.</p></li></ul>
      <p>
          <a href="https://www.thediligencestack.com/p/the-inference-payback">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The E/AI Index: Budget Architecture and the Next Phase of Enterprise AI Adoption]]></title><description><![CDATA[How CIOs and CTOs are funding AI, where deployment is moving into production, and which vendor categories are exposed as AI shifts from experimentation to budget reallocation]]></description><link>https://www.thediligencestack.com/p/the-eai-index-budget-architecture</link><guid isPermaLink="false">https://www.thediligencestack.com/p/the-eai-index-budget-architecture</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Tue, 12 May 2026 15:15:42 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!dvsA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1700cc0-9edd-447c-80a6-c53ee7ad6aa7_636x380.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;e893546f-e87b-4592-a968-49e50ec65bee&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;From AI Usage to AI Earnings Power&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-05-05T14:50:20.642Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1c391c28-b04b-4830-b0d4-63fd555a0372_2816x1536.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/from-ai-usage-to-ai-earnings-power&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:196440058,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:6,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;40e4672f-0189-43eb-8e4e-df417a8e36e0&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;From Model Wars to Platform Wars&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-05-07T14:57:20.077Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!KJrr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e2b1e42-167f-4c60-b251-b3b58216bf91_2200x1362.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/from-model-wars-to-platform-wars&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:194572162,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:3,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p><br>We have been working through the enterprise AI monetization question in the order above because the market still mistakenly compresses several different issues into one broad adoption narrative. The first report in this series focused on customer ROI and argued that AI becomes economically meaningful when it changes the cost, speed, staffing, quality, or capacity attached to a repeated unit of work. The second report moved from ROI proof to platform control and asked which layer captures the economics once those workflows move from pilot to production. 
This report takes the next step into budget architecture, where the AI cycle becomes easier to analyze because CIOs are now deciding which existing spend pools should fund the next wave of deployment.</p><p>One observation worth highlighting, drawn from our 25+ years studying this industry, is that enterprise AI adoption will not move through the market as one uniform curve. Consumer technology has always had different adoption profiles, with early adopters, the early majority, the late majority, and laggards all behaving differently as a market matures. Enterprise technology follows a similar pattern, although the behaviors are shaped by budget ownership, governance, security posture, regulatory burden, data readiness, organizational complexity, and tolerance for operational risk. The most progressive enterprises are important reference cases because they give us leading indicators of what may become possible. They also represent a minority of the market, and their behavior should not be treated as the steady-state adoption pattern for the long tail of enterprise buyers.</p><p>That is why we are committed to tracking the adoption path in real time rather than drawing broad conclusions from the most aggressive early adopters alone. Early-adopting enterprises can move faster, accept more deployment complexity, tolerate higher model or integration cost, and make bigger organizational changes because they often have stronger technical teams, cleaner data architecture, more executive sponsorship, or a clearer strategic mandate. The broader enterprise market behaves differently. The majority of companies move through procurement, governance, legal review, security architecture, workflow integration, and CFO scrutiny at a slower pace. 
That does not make the cycle less real, but it changes how we should think about timing, value capture, and the durability of current usage patterns.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dvsA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1700cc0-9edd-447c-80a6-c53ee7ad6aa7_636x380.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dvsA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1700cc0-9edd-447c-80a6-c53ee7ad6aa7_636x380.png 424w, https://substackcdn.com/image/fetch/$s_!dvsA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1700cc0-9edd-447c-80a6-c53ee7ad6aa7_636x380.png 848w, https://substackcdn.com/image/fetch/$s_!dvsA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1700cc0-9edd-447c-80a6-c53ee7ad6aa7_636x380.png 1272w, https://substackcdn.com/image/fetch/$s_!dvsA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1700cc0-9edd-447c-80a6-c53ee7ad6aa7_636x380.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dvsA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1700cc0-9edd-447c-80a6-c53ee7ad6aa7_636x380.png" width="636" height="380" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d1700cc0-9edd-447c-80a6-c53ee7ad6aa7_636x380.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:380,&quot;width&quot;:636,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:84551,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://thediligencestack.com/i/196476560?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1700cc0-9edd-447c-80a6-c53ee7ad6aa7_636x380.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dvsA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1700cc0-9edd-447c-80a6-c53ee7ad6aa7_636x380.png 424w, https://substackcdn.com/image/fetch/$s_!dvsA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1700cc0-9edd-447c-80a6-c53ee7ad6aa7_636x380.png 848w, https://substackcdn.com/image/fetch/$s_!dvsA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1700cc0-9edd-447c-80a6-c53ee7ad6aa7_636x380.png 1272w, https://substackcdn.com/image/fetch/$s_!dvsA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd1700cc0-9edd-447c-80a6-c53ee7ad6aa7_636x380.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Our CIO E/AI Index work helps locate where enterprise AI sits on the adoption curve. AI has clearly moved beyond experimentation, but it remains early in terms of full-scale operating change. In the panel, 67% of respondents now have AI running in production functions, 69% have a dedicated AI budget, and AI represents roughly 9% of IT spend on average, with budgets expected to grow again in the next planning cycle. That supports the broader AI spending cycle, but the more important finding is that enterprise behavior is becoming segmented. Progressive adopters are moving faster, funding AI more deliberately, and beginning to separate what deserves scale from what stays in pilot.</p><p>The funding mix is where the budget story becomes more useful. 
Roughly 28 cents of the incremental AI dollar comes from net-new IT budget expansion, while the remaining 72 cents is being pulled from somewhere else inside the enterprise. Existing software budgets, IT services, systems integrators, BPO contracts, headcount savings or slower hiring, license consolidation, cloud expansion, and business-unit budgets all show up in the mix. We call this out because <strong>a dollar added to the IT envelope behaves differently from a dollar taken out of a software renewal or services contract</strong>. Enterprise AI increasingly contains both at the same time, which is why the vendor read-through is becoming more selective than the aggregate spending numbers imply.</p><p>This is also where the pricing question gets more complicated. CIOs are asking which tools can be funded from an existing cost pool, which workflows have enough measurement discipline to justify the spend, and which vendors are charging an AI premium without changing the underlying job being done. A vendor with AI attached to a governed workflow, tied to a measurable operating metric and funded from services avoidance, license consolidation, or cycle-time compression, has a stronger pricing claim than a vendor adding AI on top of a seat count the customer is already trying to rationalize.</p><p>The early ROI evidence still points toward throughput before broad labor replacement. Respondents report higher output with the same headcount, faster project delivery, lower external services spend, and slower hiring before large-scale headcount reduction. That means the first financial evidence of AI is more likely to show up in services-line compression, faster internal project cycles, reduced outside implementation capacity, slower backfill behavior, and lower unit cost inside support, development, analytics, and IT operations. 
Headcount reduction may come later in some categories, but it is not the first or cleanest marker of value.</p><p>The incumbent read is therefore conditional on workflow depth. Cloud infrastructure, cybersecurity, data platforms, and productivity suites screen positively because they own interfaces and control points AI must traverse to be useful and governable. Categories tied more closely to seat access, content production, manual routing, summarization, or labor-heavy services carry more risk because AI can compress the work without requiring replacement of the system of record. The full report works through that category split in detail, because the market can treat both as &#8220;software,&#8221; while CIO budget behavior is already separating the two.</p><p>Agentic AI fits the same logic. The priority is clear, but production deployment remains concentrated in bounded workflows where approval paths, data access, and auditability can be controlled. The constraints are increasingly operational: data readiness, integration, identity, permissioning, auditability, security, exception handling, and cost predictability. That suggests the next layer of enterprise AI spend should accrue not only to model access or generic copilots, but to the control plane that makes AI deployable at scale.</p><p>The practical implication is that enterprise AI should now be analyzed through budget formation and adoption-profile behavior, not adoption alone. The evidence increasingly says AI is economically meaningful, but the value will not distribute evenly across the stack or arrive uniformly across enterprise cohorts. The next phase of diligence is identifying which budget lines AI is consuming, which workflows are producing measurable returns, and which vendors can convert those returns into pricing power without triggering procurement pushback. 
That is where AI moves from usage to earnings power.</p><p><em><strong>The E/AI Index is Creative Strategies&#8217; recurring CIO/CTO research series tracking how enterprise AI moves through adoption profiles, budget formation, deployment maturity, workflow ROI, and vendor selection.</strong></em></p><h2><strong>What paid subscribers get in the full report</strong></h2><ul><li><p><strong>The full E/AI Index dataset and adoption-profile read</strong>, including how progressive early adopters differ from the broader enterprise market and why the long tail of CIO behavior matters for sizing where we are in the AI cycle.</p></li><li><p><strong>A budget-source model for the incremental AI dollar</strong>, separating net-new IT budget expansion from dollars being reallocated out of software, services, BPO, headcount, license consolidation, cloud, and business-unit budgets.</p></li><li><p><strong>A category-level spend-intent map across the enterprise stack</strong>, showing where CIOs expect to increase spend, where budgets are being reviewed, and which categories screen as most exposed to AI-driven substitution.</p></li><li><p><strong>A framework for separating real AI ROI from usage theater</strong>, focused on workflows with measurable baselines, budget owners, governance requirements, and operating denominators that procurement can actually defend.</p></li><li><p><strong>Our read on why services compression comes before broad application replacement</strong>, including the specific kinds of SI, managed-services, implementation, testing, documentation, and support work most exposed to AI-enabled internal teams.</p></li><li><p><strong>An incumbent-risk model based on workflow depth</strong>, distinguishing platforms with identity, data context, governance, and execution control from vendors more exposed to seat access, manual routing, content production, or shallow workflow attachment.</p></li><li><p><strong>Updated agentic AI deployment data</strong>, including the 
gap between pilots, employee-assist workflows, narrow execution under human approval, and true multi-step production agents across systems.</p></li><li><p><strong>A control-plane spending map</strong>, covering the data readiness, identity, security, auditability, observability, workflow integration, and cost-predictability layers that CIOs say are now gating broader deployment.</p></li><li><p><strong>A risk hierarchy for enterprise AI adoption</strong>, including where CIOs are most concerned about data leakage, reliability, regulatory exposure, shadow AI, proprietary workflow exposure, and model-cost unpredictability.</p></li><li><p><strong>The full read-through for 2026 and 2027</strong>, including what we are watching, what would change our view, and which signals would indicate AI is moving from controlled production into broader budget reallocation across the enterprise.</p></li></ul>
      <p>
          <a href="https://www.thediligencestack.com/p/the-eai-index-budget-architecture">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[From Model Wars to Platform Wars]]></title><description><![CDATA[Who Owns the Enterprise AI Control Plane?]]></description><link>https://www.thediligencestack.com/p/from-model-wars-to-platform-wars</link><guid isPermaLink="false">https://www.thediligencestack.com/p/from-model-wars-to-platform-wars</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Thu, 07 May 2026 14:57:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!KJrr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e2b1e42-167f-4c60-b251-b3b58216bf91_2200x1362.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This report is the natural follow-on to our recent note, From AI Usage to AI Earnings Power. </p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;d33a68e3-0f14-4c90-9e29-a51622b0e54c&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;From AI Usage to AI Earnings Power&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-05-05T14:50:20.642Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1c391c28-b04b-4830-b0d4-63fd555a0372_2816x1536.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/from-ai-usage-to-ai-earnings-power&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:196440058,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:6,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>That note argued that the first investable AI cycle is forming inside dense workflows where work is repeated, measured, permissioned, and close to execution. Customer ROI is the bridge from usage to monetization, and the durability of AI software, model, and infrastructure spend depends on whether AI changes the economics attached to a repeated unit of work.</p><p>This report takes the next step. If customer dollars begin moving inside those workflows, the value-capture question becomes which layer of the enterprise stack captures them as deployments convert from pilot to production. The platform competition is the immediate consequence of the AI ROI discussion. 
Platform control follows budget formation.</p><p>The model race remains important for capability and lab-layer unit economics. The customer relationship increasingly depends on the interfaces that own permissions, telemetry, audit, and the budget conversation once agents do real work.</p><h3>From ROI Proof To Platform Control</h3><p>Our prior report focused on which workflows justify enterprise AI spend. This report focuses on who captures the economics once those workflows move into production.</p><p>Three points connect the two reports. First, AI ROI evidence is forming inside a more specific set of workflows than the broad enterprise AI narrative implies, and those workflows share a baseline, a budget owner, and a measurable economic unit of work. Second, they are also where execution, permissioning, and audit are most concentrated, which makes them the workflows where platform position is most contested. Third, value capture follows the layer that proves and prices the completed work. Model supply and interface ownership also matter, but they do not automatically determine where the durable economics settle. Usage and revenue can grow at different layers at the same time, and the long-duration economics tend to accrue to the layer the customer associates with budget movement.</p><h3>The Three Control Points</h3><p>We map the value-capture question across three layers. The model, runtime, and interface layer is where foundation model vendors compete to become the default intelligence behind agentic workloads and, increasingly, the point where user intent enters the system. The orchestration and workflow layer turns model calls into completed business outcomes. The system-of-record and governance layer owns identity, entitlement, approval, and audit. Economics do not always accrue where compute is consumed. 
They accrue where enterprises standardize the system of action.</p><p>OpenAI is executing a horizontal strategy built on distribution and intent capture, and the test is whether horizontal usage becomes recurring workflow ownership. Anthropic is executing a narrower production-credibility strategy, with coding as the strongest current ROI wedge, and the test is whether that wedge travels into other measurable-ROI workflows. Neither answer is fully proven against the workflow evidence we have so far. See our deep dives on both companies. </p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;b2b8b4cd-301d-4d87-9f8f-6b3d0dc17277&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Anthropic and the Intelligence Utility Thesis&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100},{&quot;id&quot;:121923779,&quot;name&quot;:&quot;Max Weinbach&quot;,&quot;bio&quot;:&quot;Analyst @ Creative 
Strategies&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/594247e0-2184-43ce-9740-00e8a111312e_399x399.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-02-26T18:36:00.353Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!M_3A!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7bd5c2d3-a2a8-490b-b2ae-7e20e98ad142_1270x774.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/anthropic-and-the-intelligence-utility&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:188444088,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:4,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;92fd709a-3dba-4de8-a6e6-a90d67754e53&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;OpenAI: Three Engines, One Platform&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-03-16T15:52:59.626Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!06Ap!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16cd1798-8643-431e-90de-c8e0c33b0a1c_1082x776.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/openai-three-engines-one-platform&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:191133523,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:4,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h3>Why Coding Is The Operational Laboratory</h3><p>Coding is the first workflow where AI ROI, workflow ownership, and governance bottlenecks are visible at the same time. The bottleneck has moved from authoring to review, integration, security, and policy. We treat coding as the live laboratory for how the value-capture question is likely to evolve in legal, finance, customer operations, regulated servicing, and analytics, not as the full market.</p><h3>Three Cross-Cutting Themes</h3><p>Pricing is moving from seats toward hybrid seat, action, and outcome models that price the economic unit of work. 
Enterprises are multi-model by design and consolidating by workflow, which is not the same thing as consolidating by vendor. The labor mix is moving toward a new role we describe as the AI orchestrator.</p><p>The market can support more than one winner across layers. The model, runtime, and interface layer likely supports two to three franchises with operating credibility. The orchestration and workflow layer likely supports a handful of category leaders. The governance layer should consolidate toward a smaller set of platforms that own identity, entitlement, and audit. Reading the question as a binary between OpenAI and Anthropic risks missing where customer ROI dollars are accumulating inside the workflows we tracked in the prior note. The next question is whether buyers are ready to fund those workflows at scale, which is where our CIO/CTO survey work picks up next week.</p><h3>What Paid Subscribers Get In The Full Report</h3><blockquote><p>&#8226; Our three-layer map of the enterprise AI control plane, with the criteria we use to judge which layer captures lasting margin in each category.</p><p>&#8226; A direct mapping from the AI ROI evidence in the prior report to platform position, including which workflows are most likely to define the next wave of value capture.</p><p>&#8226; A variant-perception section: what consensus believes, what is underappreciated, and where this report differs.</p><p>&#8226; A side-by-side read on OpenAI and Anthropic, sharpened around distribution and intent capture versus production credibility and utility economics, with the conditions that would cause us to reweight each.</p><p>&#8226; The counterthesis on workflow and system-of-record incumbents, with a named list of vendors we view as structurally advantaged in the agentic cycle once ROI is proven.</p><p>&#8226; Our triangulated view of frontier-lab revenue, enterprise penetration, workload share, and growth, expressed as ranges rather than point estimates and marked as 
triangulated rather than company-guided.</p><p>&#8226; An expanded read on Claude Code and coding agents, including what ROI, bottleneck migration, pricing dynamics, and workflow ownership tell us about where the next high-value workflow lands.</p><p>&#8226; Pricing architecture in practice, with the seat, action, outcome, and work-unit combinations we are seeing across CRM, ITSM, developer, and security suites.</p><p>&#8226; The buyer behavior view, including why multi-model by design does not prevent consolidation at the workflow layer.</p><p>&#8226; Token economics and unit cost curves, including how we think about deflation, call volume, and gross margin trajectory for model vendors and platforms.</p><p>&#8226; Labor and org design, including the AI orchestrator role, the change in junior hiring, and the productivity and quality tradeoffs we see in main-branch work.</p><p>&#8226; The security and governance read, including why AI-generated code failure rates matter for enterprise budget decisions and how that lesson generalizes to adjacent workflows.</p><p>&#8226; A diligence approach for evaluating any company in the stack, with the questions we ask and the red flags we weight most heavily.</p><p>&#8226; A twelve-month watch list across labs, platforms, incumbents, and private companies, with specific and measurable signals.</p><p>&#8226; What must be true for OpenAI, Anthropic, and the counterthesis to each hold, expressed as testable conditions rather than narratives.</p><p>&#8226; What would make us wrong, expressed as explicit falsification conditions with observable signals investors can track.</p><p>&#8226; The value-capture implications for public platforms, private labs, and the enterprise software stack, expressed as analytical weighting and indicators to track.</p></blockquote><p></p>
      <p>
          <a href="https://www.thediligencestack.com/p/from-model-wars-to-platform-wars">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[From AI Usage to AI Earnings Power]]></title><description><![CDATA[Agentic AI&#8217;s first investable cycle is forming around measurable work, workflow control, and budget formation]]></description><link>https://www.thediligencestack.com/p/from-ai-usage-to-ai-earnings-power</link><guid isPermaLink="false">https://www.thediligencestack.com/p/from-ai-usage-to-ai-earnings-power</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Tue, 05 May 2026 14:50:20 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1c391c28-b04b-4830-b0d4-63fd555a0372_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em><strong>Research Series Note:</strong> We are spending the next several reports on the AI monetization question from three angles. Today&#8217;s note focuses on what customer evidence is telling us about AI ROI. Thursday&#8217;s report will look at how we see the AI platform war evolving as value capture moves from models to workflows, data, and control points. Next Tuesday, we will publish findings from a CIO/CTO survey we collaborated on, focused on AI ROI, budget formation, and how enterprise buyers are deciding which deployments deserve more funding.</em></p><div><hr></div><p><br>Over the better part of the last two years, we have tracked enterprise AI adoption as it moved from experimentation to early production, with the ROI discussion becoming more central as deployments moved closer to operating budgets. The process is still early, and we would be careful not to overread any single customer story or survey result, but we are at the point where the AI discussion has clearly moved from capability to economic proof. Most companies no longer need to be convinced that AI can generate useful output, improve workflows, or create new product experiences. 
The harder issue is whether those improvements are large enough, repeatable enough, and measurable enough to justify the next layer of spending across software, models, services, and infrastructure. That is where the market&#8217;s AI narratives start to miss the economic question. Usage shows that a product has distribution. ROI shows whether the customer has a reason to keep funding it. We outlined the software monetization model needed for AI in the report below. </p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;a03ac32c-b29e-4917-9921-c95e021e635c&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;AI Infrastructure Economics: The $2-for-$1 Problem&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100},{&quot;id&quot;:121923779,&quot;name&quot;:&quot;Max Weinbach&quot;,&quot;bio&quot;:&quot;Analyst @ Creative 
Strategies&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/594247e0-2184-43ce-9740-00e8a111312e_399x399.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-02-02T19:13:08.197Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!AjsF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3435c58-665a-4c60-99c2-c4513b5e3985_970x742.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/ai-infrastructure-economics-the-2&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:183463877,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:14,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>That is why we went through the exercise of collecting tangible AI ROI use cases rather than relying on broad adoption surveys (we have that data as well) or vendor-level attach commentary. We are sensitive to the broader question of AI cycle durability, and one of the key variables in that discussion is whether customers can point to enough measurable value to justify the spending now being built into software roadmaps, frontier-lab revenue expectations, and infrastructure deployment plans. The durability of this cycle will not be determined by capability alone. 
<strong>It will depend on whether AI becomes economically useful enough inside customer workflows to support continued budget formation.</strong></p><p>That distinction is becoming more important as AI moves deeper into enterprise budgets. Software vendors need customer ROI to defend premium SKUs, higher attach rates, and usage-based pricing. Frontier labs need enterprise ROI to support consumption growth and the revenue expectations now embedded in the category. Infrastructure vendors need the same proof because the broader AI capex cycle ultimately depends on customers being able to convert compute into business value. CIOs and CFOs are also changing how they evaluate AI projects. The experimentation phase is not over, but the next wave of spending will face a more practical test: which workflows are being repriced, restaffed, accelerated, or made cheaper to run?</p><p>Our latest research note looks at agentic AI through that lens. Our conviction, as of now, is that the first investable ROI cycle is forming in a more specific set of workflows than the broad enterprise AI narrative implies. <strong>The strongest evidence appears where AI is tied to repeated work with a measurable baseline and a budget owner already attached.</strong> Contact centers, regulated servicing operations, IT access management, developer workflows, and enterprise search or context layers all prove out well because they have visible denominators: calls, tickets, access requests, summaries, code cycles, compliance reviews, or knowledge retrieval tasks. When AI changes the cost, speed, quality, or capacity of those units of work, the monetization claim becomes easier to evaluate.</p><p>Outside that group, the evidence is still mixed. Sales, RevOps, HR, and finance back-office workflows have large budgets and repeated tasks, but attribution, exceptions, permissions, and liability make automation harder to underwrite. 
Broad knowledge-worker assistants may show usage and time saved, yet those metrics often stop short of proving a funded operating change. Fully autonomous cross-enterprise agents remain even earlier, with reliability, identity, data integration, and liability still limiting the move from interface to execution. The evidence is still incomplete, but the direction is worth taking seriously.</p><p>For stakeholders, we believe the economic issue is budget formation (as we will show in our CIO/CTO survey). The customer ROI test starts with the spend pool the agent is permitted to influence. Labor in a contact center, after-call work in a servicing operation, IT ticket queues, access management, developer capacity, onboarding, enterprise search, and compliance documentation are all different budget conversations. The more measurable the work, the easier it is for the customer to defend spend and for the vendor to price against value created.</p><p>That also changes how we think about value capture. The agent interface may not always be the economic control point (more on this Thursday). In some workflows, the application vendor controls the system of action. In others, the data and context layer gets funded first because enterprises need governed knowledge before agents can act. Identity and security vendors may become more central as agents behave like non-human actors inside enterprise systems. Model providers can see consumption grow while the workflow economics accrue to applications, orchestration layers, or internal routing systems. Services firms may benefit from data cleanup and integration near term while facing pressure later if AI automates repetitive support, QA, migration, or maintenance work. 
For more on this, see our report on who has competitive moats in SaaS.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;2a240532-4729-402a-b910-eebaf6008e6e&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Who is Safe in SaaS? The Lens Via our Data Moat Scorecard&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-02-24T18:13:29.464Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!SLvf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39bc87f2-edc6-4282-b43d-e6a248c8a0c6_1116x834.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/who-is-safe-in-saas-the-lens-via&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:187152368,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:5,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>The practical implication is that AI ROI 
should be evaluated at the workflow level before it is extrapolated to the enterprise software stack. The cycle that earns the most analytical weight first will be the one where the productivity claim and the monetization claim become the same claim. That is the core focus of this report.</p><h2><strong>Paid subscribers get the full report, including:</strong></h2><ul><li><p><strong>A ranked evidence ladder</strong> separating stronger operating-economics cases from projected, anecdotal, infrastructure, and risk evidence.</p></li><li><p><strong>Five primary customer case studies</strong> across healthcare call centers, regulated mortgage servicing, IT access management, developer workflows, and enterprise search/context.</p></li><li><p><strong>A workflow-density heat map</strong> showing which enterprise AI use cases have the clearest near-term ROI visibility and which remain harder to underwrite.</p></li><li><p><strong>A customer-story-to-budget-formation table</strong> mapping each use case to the affected budget line, likely control point, public-company read-through, and durability test.</p></li><li><p><strong>A value-capture layer map</strong> covering workflow owners, data/context platforms, developer platforms, identity and security, model providers, SIs, and vertical AI vendors.</p></li><li><p><strong>An earnings-call tracking dashboard</strong> focused on production conversion, paid attach versus bundling, workflow-level economics, data-readiness pull-through, agent governance, developer-tool durability, services mix, and pricing-model evolution.</p></li><li><p><strong>A risk framework</strong> for where the market may be overgeneralizing, including broad copilot adoption, AI attach, seat-based SaaS pricing, SI exposure, model-layer value capture, and data-readiness bottlenecks.</p></li><li><p><strong>The broader AI monetization read-through</strong> tying customer ROI evidence to software pricing, frontier-lab consumption, and the durability of 
infrastructure spend.</p><p></p></li></ul><p></p>
      <p>
          <a href="https://www.thediligencestack.com/p/from-ai-usage-to-ai-earnings-power">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[SanDisk’s NBM Moment: A Different Commercial Model for Memory and Storage]]></title><description><![CDATA[SanDisk&#8217;s quarter produced extraordinary earnings, but the more important signal may be its disclosure around new multi-year customer agreements. In this note, we examine how NBMs, financial guarantees, and contracted future bit supply could change the way investors model NAND, and potentially memory more broadly, as AI infrastructure customers begin treating storage access as a strategic requirement rather than a quarterly procurement decision.]]></description><link>https://www.thediligencestack.com/p/sandisks-nbm-moment-a-different-commercial</link><guid isPermaLink="false">https://www.thediligencestack.com/p/sandisks-nbm-moment-a-different-commercial</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Fri, 01 May 2026 15:00:35 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/d09184d3-4f1c-47a9-997a-d2f981cbd4d3_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>SanDisk&#8217;s earnings were extraordinary by any normal semiconductor standard. However, what stood out most from the quarter was the company&#8217;s disclosure around new multi-year customer agreements. SanDisk&#8217;s updated NBM disclosure gives us a useful early view into how the commercial structure of storage (and memory) may evolve as AI infrastructure becomes a larger share of demand. The historical model for NAND has been built around bit supply, utilization, spot pricing, inventory, and capex discipline. Those variables don&#8217;t go away, although they no longer capture the full economic structure if customers are willing to reserve future supply and attach financial commitments to that demand. 
The more useful forward framework is likely to include contract coverage, enforceability, pricing structure, renewal cadence, and the portion of future bits already tied to customer infrastructure plans.</p><p>The nature and size of SanDisk&#8217;s disclosure are what make the shift worth highlighting, and worth considering for how this structure may fundamentally change memory and storage contracts going forward. Management said it has signed five multi-year NBMs to date, with more than one-third of FY27 bits already under firm customer commitments, more than $11 billion of financial guarantees, and roughly $42 billion of minimum contractual revenue from only the three agreements signed during the quarter. The agreements include quarterly volume commitments, a mix of fixed and variable pricing, and durations that can extend up to five years. We call these terms out because they take the focus off near-term pricing strength and say more about <strong>the value customers are placing on assured future access.</strong> For SanDisk, the benefit is better visibility into consumption, allocation, mix, and margin durability.</p><p>The supplier-customer mismatch is the key point and functional change. SanDisk runs a fab-based model with relatively consistent output, while customers have historically wanted supply assurance and quarterly pricing optionality at the same time. Management described the new structure as a way to obtain &#8220;certainty of economics,&#8221; which we think is the most useful phrase from the call. A supplier can make different decisions around allocation, inventory, capex, and customer mix when demand is committed and financially backed rather than forecasted and repriced every quarter. The customer also receives a more reliable supply path for infrastructure plans that are becoming harder to adjust at the last minute. 
As stated, we believe this structure becomes a normal environment for all those we label as masters of the supply chain with regard to storage and memory, <strong>and very likely the entirety of the semiconductor supply chain. </strong></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;d1383f77-ec28-44d5-92ef-c025b745e85e&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Masters of the Supply Chain&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-03-12T15:59:26.652Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!zuM1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62cb2ed2-6774-4fc1-bd1c-bc42fa48e670_1162x638.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/masters-of-the-supply-chain&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:190217466,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:3,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative 
Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>As should be obvious, AI is the demand mechanism behind the change. But more specifically, as we have been calling out with regard to memory and storage, it is the co-designed nature of AI-related memory and storage content that is driving this change. For that reason, management connected the NBM structure to inference, longer context, KV cache, RAG, and agentic systems, all of which increase the need for high-performance, low-latency flash inside AI infrastructure. In that environment, NAND is becoming more integrated into the AI factory because systems need to retain context, intermediate data, and external datasets around the model. When customers commit years of demand against those requirements, it tells us storage access is becoming valuable enough to reserve in advance, especially when the cost of being wrong on supply can affect broader infrastructure deployment. We detail all that is going on with storage in co-optimized AI infrastructure in our deep dive on storage. 
</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;acbc7935-5658-42c8-bef4-e015057a41d6&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Storage Wars: When Memory and Storage Collapse Into One Layer&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-04-07T15:07:47.695Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a7337085-4659-43b1-89ba-502fab16c373_2752x1536.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/storage-wars-when-memory-and-storage&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:193076442,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:7,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>We would be careful with any claim that cyclicality is over as a whole. 
NAND and DRAM will still have pricing cycles, inventory corrections, supply responses, and periods of digestion after customers pull demand forward. The better observation, and our conviction, is that AI infrastructure may be changing the shape and severity of those cycles. The old model allowed customers to preserve optionality while suppliers absorbed most of the volatility. The new model moves part of that volatility back to customers through committed demand, financial guarantees, and purchase obligations. If these structures broaden, the cycle becomes less dependent on quarterly spot-price negotiation and more dependent on how much future supply has already been allocated under enforceable commitments. We expect the latter. </p><p>The broader read-through is that SanDisk may be the clearest public example of a larger shift already forming across memory and storage. The same logic should apply to HBM, high-capacity DRAM, and other AI-tied memory configurations where future access is even more strategically important. Hyperscalers, model labs, and AI infrastructure operators are planning GPU clusters and inference capacity years in advance. That planning increasingly requires a memory and storage stack with similar visibility. Customers may still prefer flexibility, although the cost of being under-allocated is rising as AI systems become more dependent on specific memory and storage configurations.</p><p>For stakeholders, the modeling framework should expand beyond near-term ASPs and bit growth. Contracted bit coverage, RPO-like disclosure, financial guarantees, fixed versus variable pricing exposure, customer renewal behavior, and the maturity ladder of agreements become more useful indicators of earnings quality. 
The key debate moves from whether current earnings represent peak conditions to how much of the earnings base is supported by customer behavior that looks structurally different from prior cycles.</p><p>Our view is that SanDisk&#8217;s NBM disclosure is an early sign that memory and storage are moving toward a different commercial architecture. The market has historically discounted peak memory earnings because the cycle eventually gave them back. If a larger portion of forward supply becomes contracted, enforceable, and tied to multi-year AI infrastructure demand, then durability becomes a larger part of the discussion. <strong>That would be a materially different way to model storage and memory over the next several years, or longer.</strong> For further reading, see our reports below. </p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;f46898ba-8d1b-456b-ac41-0b159ea8e274&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The Next Debate in Memory Is Duration, Not Demand&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-04-02T15:25:36.698Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b5af0853-05a9-4a9e-9f6c-592250f7ff80_888x486.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/the-next-debate-in-memory-is-duration&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:192963396,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:9,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;cb76a707-561e-4e24-b939-873eeaa96201&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Memory's $200B Inflection&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben 
Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-02-19T17:15:37.190Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!rSQ7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d87bc96-3327-41e2-955f-511bac8f0f89_1244x890.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/memorys-200b-inflection&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:187534208,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:11,&quot;comment_count&quot;:1,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div>]]></content:encoded></item><item><title><![CDATA[Microsoft’s AI Capex Is Buying the Enterprise Agent Stack]]></title><description><![CDATA[How compute allocation turns AI infrastructure spend into enterprise software ARPU]]></description><link>https://www.thediligencestack.com/p/microsofts-ai-capex-is-buying-the</link><guid isPermaLink="false">https://www.thediligencestack.com/p/microsofts-ai-capex-is-buying-the</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Thu, 30 Apr 2026 17:13:34 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!7v_t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe181a598-f728-457f-8622-183ad1468ec7_996x590.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>After completing AI growth thesis reports on Amazon and Google, both of which held up very well in light of both companies&#8217; earnings reports yesterday, we now turn our attention to Microsoft. </p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;2e206826-a8d1-45ca-843d-19a4aa40152d&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The AWS Acceleration Thesis: Why Amazon's Cloud Business May Be Entering Its Most Consequential Growth Phase&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-02-17T14:22:59.763Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/$s_!E3Ay!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f23f2f-b495-4b92-8e57-a9b9e9e10c94_1200x732.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/the-aws-acceleration-thesis-why-amazons&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:183382817,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:4,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence 
Stack - By Creative Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;25be4da4-1a65-4033-a233-c29ce17437bb&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Google&#8217;s AI Capex Is Being Measured Against the Wrong Revenue Line&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-04-23T15:25:49.147Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b610a527-9d2c-4a69-88d5-f4bf9ee2495c_1024x1024.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/googles-ai-capex-is-being-measured&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:186999354,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:4,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative 
Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>Microsoft&#8217;s AI debate has been too centered on the visible parts of the model: Azure capacity, GPU availability, capex, and cloud gross margin. Those are the right places to start, but they do not fully explain the growth thesis. The issue we keep coming back to is allocation. Microsoft does not have unlimited deployable AI capacity, and every GPU pushed toward one workload has an opportunity cost somewhere else. Some capacity is sold externally through Azure. Some supports OpenAI-related demand. Some is consumed internally by Microsoft 365 Copilot, GitHub, Fabric, Foundry, Dynamics, Security, and the agent-governance layer now forming across the enterprise stack. That allocation decision is where the earnings power debate sits.</p><p>Q3 gave more support to the view that Microsoft&#8217;s AI numerator is expanding. AI annualized revenue run rate surpassed $37 billion, Azure grew 40%, and Microsoft 365 Copilot crossed 20 million paid seats. We would not treat those as separate proof points. They point to the same mechanism: Microsoft is turning AI capacity into infrastructure consumption, software seats, developer usage, data-platform pull-through, and agent tooling. This is why Azure-only return math is the wrong singular focus. It captures the most visible revenue stream, while missing the monetization that shows up across the software estate.</p><p>We understand the attention high fixed costs get as part of this cycle. 
Capex remains elevated, Q4 spend is guided higher, and calendar-year 2026 capex is expected to reach roughly $190 billion, including about $25 billion tied to component inflation rather than incremental capacity. Microsoft Cloud gross margin compressed to 66%, with Q4 guided to roughly 64%. We recognize the margin pressure, but the key question is whether Microsoft&#8217;s software monetization scales quickly enough to offset the denominator the market can already see. If Microsoft can convert scarce compute into Microsoft 365 ARPU, GitHub usage, Fabric consumption, Security attach, and agent-governance revenue, this becomes a broader enterprise software monetization cycle rather than a pure Azure capacity build.</p><p>The broader cloud RPO and backlog data help explain why the debate over whether the AI infrastructure cycle is durable remains flawed. Across AWS, Google Cloud, and Microsoft, the forward revenue base tied to cloud and AI infrastructure has moved from steady compounding into a steeper AI-era slope. The numbers are not perfectly comparable: AWS discloses long-term performance obligations primarily related to AWS, Alphabet reports remaining performance obligations, or &#8220;revenue backlog,&#8221; primarily related to Google Cloud, and Microsoft reports total commercial RPO rather than Azure-specific backlog. Even with those caveats, the direction is useful. 
AI demand is now showing up in contracted revenue visibility across the major cloud platforms, adding another layer of evidence beyond management commentary, quarterly growth rates, and capacity announcements.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7v_t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe181a598-f728-457f-8622-183ad1468ec7_996x590.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7v_t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe181a598-f728-457f-8622-183ad1468ec7_996x590.png 424w, https://substackcdn.com/image/fetch/$s_!7v_t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe181a598-f728-457f-8622-183ad1468ec7_996x590.png 848w, https://substackcdn.com/image/fetch/$s_!7v_t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe181a598-f728-457f-8622-183ad1468ec7_996x590.png 1272w, https://substackcdn.com/image/fetch/$s_!7v_t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe181a598-f728-457f-8622-183ad1468ec7_996x590.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7v_t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe181a598-f728-457f-8622-183ad1468ec7_996x590.png" width="996" height="590" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e181a598-f728-457f-8622-183ad1468ec7_996x590.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:590,&quot;width&quot;:996,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:158932,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://thediligencestack.com/i/195953320?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe181a598-f728-457f-8622-183ad1468ec7_996x590.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7v_t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe181a598-f728-457f-8622-183ad1468ec7_996x590.png 424w, https://substackcdn.com/image/fetch/$s_!7v_t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe181a598-f728-457f-8622-183ad1468ec7_996x590.png 848w, https://substackcdn.com/image/fetch/$s_!7v_t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe181a598-f728-457f-8622-183ad1468ec7_996x590.png 1272w, https://substackcdn.com/image/fetch/$s_!7v_t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe181a598-f728-457f-8622-183ad1468ec7_996x590.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>The backlog inflection provides the industry evidence behind the durability debate. AWS and Google confirm that AI infrastructure demand is broadening across hyperscale cloud, but Microsoft&#8217;s growth thesis turns on a familiar question: whether scarce compute allocated internally can convert into software ARPU, usage meters, Fabric pull-through, security attach, and agent-governance control points. That is why the Microsoft debate cannot stop at backlog, Azure growth, or capacity additions.</p><p>The first-party allocation thesis remains a key part of the core angle. Our working model assumes roughly 30% of newly deployed GPU capacity is going to first-party workloads and roughly 70% to external Azure customers, triangulated from channel work and consistent with Q3 commentary. 
Internal capacity allocated to Microsoft 365 Copilot, GitHub Copilot, Foundry, and Dynamics AI does not appear as third-party Azure revenue. It monetizes through seat ARPU, data pull-through, workflow attach, and governance. The cost is visible in cloud gross margin today, while the software return is still scaling across Microsoft 365 Commercial Cloud and the broader enterprise estate.</p><p>Customer evidence is moving in Microsoft&#8217;s direction, but adoption is still early. CIO and channel work places AI at roughly 8&#8211;10% of IT budgets, with most large organizations now carrying a dedicated AI budget line. Funding is coming from both reallocation and incremental budget, and the reallocation pressure appears heavier on IT services and systems integrators than on cloud infrastructure or cybersecurity. That mix favors Microsoft&#8217;s protected cloud and security layers and its packaged AI software model. The constraint is production maturity, with channel checks still showing GenAI rollout in the single digits and most customers in pilot phases.</p><p>We understand the risks that still exist. Microsoft Cloud gross margin is under pressure, component inflation is now a material capex variable, and bookings will remain noisy as the OpenAI comparison rolls through the model. The revised OpenAI partnership improves Microsoft&#8217;s economics on OpenAI-derived workloads through retained equity, revenue-share payments through 2030 subject to a cap, continued non-exclusive IP rights through 2032, and the removal of Microsoft&#8217;s revenue-share obligation to OpenAI. The offset is that OpenAI now has more freedom to serve products across other clouds.</p><p>We would evaluate Microsoft&#8217;s AI capex as a portfolio monetization cycle, not only as an Azure capacity cycle. 
Azure is the infrastructure base, but the return path runs through Microsoft 365 Copilot, E5, the announced E7 frontier suite, Fabric, GitHub, Dynamics, Security, and the Agent 365 / Copilot Studio governance layer. The infrastructure cost is already visible in the income statement, while software and governance ARPU remain earlier in their curve. The next four quarters should show whether Microsoft&#8217;s user-plus-usage transition accelerates faster than depreciation pressure weighs on cloud gross margin. Microsoft&#8217;s AI capex is buying the foundation for a broader enterprise software pricing and control layer.</p><h3><strong>What subscribers get in the full note</strong></h3><ul><li><p>Full multi-vector AI ROIC framework, including why Azure-only return math understates the numerator after the Q3 print.</p></li><li><p>Updated earnings analysis covering AI ARR, Azure growth, capex composition, capacity, and the OpenAI restructure.</p></li><li><p>First-party versus third-party GPU allocation, refreshed with the Q3 allocation language.</p></li><li><p>Customer-side validation triangulated across CIO, partner, and channel work on AI budget formation, vendor positioning, and deployment maturity.</p></li><li><p>Azure growth, capex, margin, and capacity model, including component-price disclosure and Q4 trajectory.</p></li><li><p>Copilot, E5, and the E7 frontier suite as a layered ARPU model.</p></li><li><p>Fabric and the data pull-through layer underneath enterprise agents.</p></li><li><p>Agent 365 and Copilot Studio as the enterprise agent governance and control plane.</p></li><li><p>OpenAI concentration, the new revenue-share structure, multi-cloud risk, and Anthropic integration.</p></li><li><p>Competitive framework versus AWS and Google, scenario framework, and a metric monitoring dashboard for the next four quarters.</p></li></ul><p></p>
      <p>
          <a href="https://www.thediligencestack.com/p/microsofts-ai-capex-is-buying-the">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Custom ASIC Is No Longer One Market]]></title><description><![CDATA[Scope ownership, attach leverage, and the separation of profit pools in AI infrastructure]]></description><link>https://www.thediligencestack.com/p/custom-asic-is-no-longer-one-market</link><guid isPermaLink="false">https://www.thediligencestack.com/p/custom-asic-is-no-longer-one-market</guid><dc:creator><![CDATA[Ben Bajarin]]></dc:creator><pubDate>Tue, 28 Apr 2026 16:08:29 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f4ee99b2-98f9-409d-b2ab-8ea4b550d1b1_2400x1792.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Coming out of discussions at Google Cloud Next last week, we expect custom AI silicon to move back toward the center of the AI infrastructure debate. The spend trajectory across hyperscalers, dedicated infrastructure operators, and frontier model labs continues to support our view that custom silicon remains a key component of the buildout. The next phase of the discussion should focus on how that spend is captured, which layers of scope carry durable economics, and where strategic control shifts as customers become more sophisticated.</p><p>The public conversation still tends to group custom ASIC exposure into a single market narrative at a point when the underlying market is becoming more layered. Custom ASIC now covers programs with meaningfully different economics, margin structures, durability, and control points. That shorthand was useful when the category mostly referred to a compute die. The category now increasingly describes full systems spanning compute, memory, networking, I/O, packaging, and integration.</p><p>That shift changes how we frame the opportunity. 
Vendor exposure should be evaluated by scope quality: which layers of the program are owned, how scarce those layers are, how much execution risk they remove, and whether the role can support durable earnings quality as hyperscalers retain more architectural control internally.</p><p>Across the cohort, the businesses being grouped together are doing very different work. Some vendors are selling broad execution ownership across compute, packaging, and networking; others are selling attach; others are selling I/O and modular implementation; and others are monetizing physical design, foundry adjacency, and packaging coordination. Hyperscalers will continue to outsource what is scarce, risky, or time-sensitive while pushing to reclaim the layers where internal ownership lowers cost or improves control. The variables investors should be tracking are which vendor owns which layer of the stack, how durable that scope is, what attach travels alongside it, and where insourcing pressure is most likely to land first.</p><h2>Why this matters now</h2><p>Google&#8217;s most recent TPU roadmap is the most timely evidence for this view. By separating the eighth-generation TPU family into training-oriented and inference-oriented chips, Google is signaling that workload classes are now diverging at the silicon level. We expect that divergence to deepen as training systems continue to optimize around scale, synchronization, memory bandwidth, and reliability, while inference systems optimize around latency, utilization, cost per token, and deployment flexibility.</p><p>That divergence should also change supplier allocation. The full partner structure across these systems is not completely disclosed, but supply-chain checks point to a more layered model in which different external partners participate in different parts of the stack while Google retains significant internal architectural ownership. 
The same direction of travel is visible across AWS, Microsoft, Meta, OpenAI, and Anthropic. Each customer appears to be making its own decision about where internal architecture matters most and where external execution, IP, or capacity can create leverage.</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;3fa2e5c3-df32-4437-9a70-cc1ba890a222&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Google&#8217;s TPU Strategy Offers a Clearer View of the Next AI Bottleneck&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-04-22T16:44:53.493Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f110929-c5d5-486a-95c6-b95edef62dd0_1598x814.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/googles-tpu-strategy-offers-a-clearer&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:194996949,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:9,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative 
Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;7c9c670d-3b15-42c1-97f3-0f24b279c573&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Google&#8217;s AI Capex Is Being Measured Against the Wrong Revenue Line&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:21971657,&quot;name&quot;:&quot;Ben Bajarin&quot;,&quot;bio&quot;:&quot;CEO&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc186a30-2fc0-4b79-ad09-869042c38eac_772x772.png&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100}],&quot;post_date&quot;:&quot;2026-04-23T15:25:49.147Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b610a527-9d2c-4a69-88d5-f4bf9ee2495c_1024x1024.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://thediligencestack.com/p/googles-ai-capex-is-being-measured&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:186999354,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:4,&quot;comment_count&quot;:0,&quot;publication_id&quot;:4189414,&quot;publication_name&quot;:&quot;The Diligence Stack - By Creative 
Strategies&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7nPP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa520e3ad-752d-4d23-aa9a-515b8401a908_1280x1280.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><h2>The framework</h2><p>The full report separates the cohort into five distinct business models, each with its own profit pool and its own relationship to insourcing pressure. Premium full-scope orchestration carries the broadest scope and the strongest margin tier alongside the most concentrated insourcing pressure across the next two to three generations. Flexible compute plus attach can be margin dilutive on standalone compute and is decided on whether attach travels alongside the compute award. Hybrid I/O and modular custom silicon represents a different price point and is the most important business-model experiment in the group. Back-end implementation and turnkey execution carries lower margin and higher beta to ramps, while foundry-adjacent packaging and enablement carries scarcity exposure tied to advanced packaging tightness, with revenue conversion that is more lumpy than headline ASIC framings imply.</p><p>Applied across Broadcom, Marvell, MediaTek, Alchip, and GUC, the framework produces five very different exposures. The margin spread alone runs from blended margins in the teens at the back end of the stack to mid-50s at the premium full-scope orchestration tier. That gap is too wide to treat custom ASIC exposure as economically equivalent across the vendor base. 
Custom silicon continues to be an important part of the AI infrastructure buildout, but the next phase of the debate will be decided by scope quality rather than socket wins: who owns scarce IP, who reduces execution risk, and who can attach higher-quality content to the compute program itself.</p><p>A Broadcom dollar, a Marvell dollar, a MediaTek dollar, an Alchip dollar, and a GUC dollar do not carry the same margin structure, durability, or strategic risk. Treating them as comparable assets confuses exposure with exposure quality. The vendors that retain scarce IP, reduce execution risk, or attach higher-quality content to custom compute should have a better chance of holding their economics through the buildout. The vendors with thinner positions will need volume, repeatability, or scarcity to support valuation durability. That distinction is the core of the full report.</p><h2>What full subscribers receive</h2><blockquote><ul><li><p>The full institutional report, including the framework, company sections, scorecards, margin work, and watchlist, written in our voice and structured for buy-side use.</p></li><li><p>The custom silicon stack map with a layer-by-layer view of where economic value, insourcing risk, and strategic leverage actually sit.</p></li><li><p>The five-model business framework covering premium full-scope orchestration, flexible compute plus attach, hybrid I/O and modular silicon, back-end implementation, and foundry-adjacent enablement.</p></li><li><p>Full company sections for Broadcom, Marvell, MediaTek, Alchip, and GUC, including stack ownership, revenue pool exposure, and the central underwriting question for each name.</p></li><li><p>The comparative framework table covering primary role, scope ownership, main revenue pool, margin quality, insourcing risk, attach leverage, customer concentration, and what investors are underwriting.</p></li><li><p>Per-company scorecards summarizing strengths, weaknesses, opportunities, threats, and margin 
tier.</p></li><li><p>A dedicated margin debate section that traces gross margin tiers across the cohort and explains why hyperscalers will continue to pressure pricing.</p></li><li><p>The Google case study applied to the broader market transition rather than as a single-customer note.</p></li><li><p>A risks section covering what would weaken and what would strengthen the framework.</p></li><li><p>The full What We Are Watching section across the cohort and per name, updated through subsequent notes.</p></li></ul></blockquote>
      <p>
          <a href="https://www.thediligencestack.com/p/custom-asic-is-no-longer-one-market">
              Read more
          </a>
      </p>
   ]]></content:encoded></item></channel></rss>