GM. This is Milk Road AI, where we analyze the businesses, semiconductors, and supply chains behind the AI boom.
Here's what we've got for you today:
- The wafer-scale chip bet shaking up the AI inference race.
- 🎙️ The Milk Road AI Show: ARK Invest: Why SpaceX Wants To Put AI Data Centers In Orbit.
- 🍪 OpenAI adds real-time voice translation.
Nexo is back in the U.S. - and new clients get 30 days of Wealth Club Premier perks! Higher yields, lower borrowing rates, and crypto cashback - start here.

Prices as of 10:00 a.m. ET.

THE CHIP THAT REFUSED TO BE SMALL
In 1984, IBM made a decision that seemed completely rational at the time.
They were building the PC and needed a chip.
Instead of making their own, they licensed it from a small company in California called Intel.

Simple, cheap, fast. What could go wrong? Everything, it turns out.
Over the next two decades, Intel became the most important company in computing, and IBM's decision to outsource the brain of its own product became the most expensive shortcut in business history.
The semiconductor industry has a long, proud tradition of people refusing to follow conventional wisdom, and every decade or so, one of those refusals turns into a trillion-dollar market shift.
We might be watching one happen right now.
Tomorrow, a company called Cerebras Systems will list on the Nasdaq under the ticker CBRS.
The IPO was 20x oversubscribed, and the price range has been raised twice.
And the deal has the kind of institutional frenzy that makes portfolio managers cancel their golf games.
But the real insight behind Cerebras Systems isn't making a faster chip. It's refusing to cut the wafer at all.
The most obvious idea nobody tried
The process of making semiconductors has remained largely unchanged for 60 years.
You take a silicon wafer about the size of a dinner plate, print hundreds of tiny chips onto it with light (photolithography), then dice it up like a pizza.
Each chip is used in a phone, a server, or a graphics card.
And every chip company on earth does it this way: Nvidia, AMD, Intel.
A Cerebras engineer looked at this process and had an idea so simple it's almost annoying.
What if you just didnât cut it?
The result is the Wafer-Scale Engine, or WSE, and it is, without exaggeration, the largest chip ever built.
We're talking 46,225mm² versus the 814mm² of Nvidia's H100, roughly 57 times bigger than the most powerful GPU on the planet.

The reason this matters comes down to one word: bandwidth.
Every time an AI model generates a word, it has to reach into memory, grab a bunch of numbers called weights, multiply them together, and spit out a prediction.
Do that a thousand times per second, per user, across millions of concurrent sessions, and the bottleneck isn't computing power but how fast the chip can move data from memory to the processor.
Nvidia's H100 moves data at roughly 3 terabytes per second, while the WSE-3 moves data at 21 petabytes per second, around 7,000 times faster.

The reason is simple: when memory and compute live on the same enormous chip, data barely needs to move.
One of the biggest bottlenecks in AI today is the memory wall, where models constantly shuttle data back and forth between memory and processors, creating delays and inefficiencies.
Cerebras Systems reduces that problem by keeping everything on a single wafer-scale chip, which becomes incredibly valuable in the inference era, where millions of AI requests are being processed nonstop.
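If you want to sanity-check those numbers, here's a back-of-envelope sketch (ours, not Cerebras'). The bandwidth and die-size figures are the ones quoted above; the model size and precision are illustrative assumptions we picked to make the arithmetic concrete.

```python
# Back-of-envelope memory-wall math. Bandwidth and die-size figures
# are the ones quoted above; the 70B-parameter model at fp16 is our
# illustrative assumption, not a Cerebras spec.

H100_BW = 3e12    # ~3 TB/s of HBM bandwidth
WSE3_BW = 21e15   # ~21 PB/s of on-chip SRAM bandwidth

PARAMS = 70e9           # hypothetical 70B-parameter dense model
BYTES_PER_WEIGHT = 2    # fp16

# Assume every weight is read once per generated token (batch size 1)
bytes_per_token = PARAMS * BYTES_PER_WEIGHT  # ~140 GB moved per token

for name, bw in [("H100", H100_BW), ("WSE-3", WSE3_BW)]:
    # Upper bound on single-stream speed if bandwidth is the only limit
    print(f"{name}: ~{bw / bytes_per_token:,.0f} tokens/sec ceiling")

print(f"Bandwidth gap: ~{WSE3_BW / H100_BW:,.0f}x")  # ~7,000x
print(f"Die area gap: ~{46_225 / 814:.0f}x")         # ~57x
```

The absolute ceilings move around with model size and batching, but the ratio is the point: the same model that's bandwidth-starved on one chip has headroom to spare on the other.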
Why timing is everything
For the last four years, the AI industry has been obsessed with training.
Training models that cost $100M and take six months on ten thousand GPUs.
That's the game Nvidia has completely dominated, and will continue to dominate for the foreseeable future.
But something is shifting.
OpenAI has already trained GPT-5, Google has Gemini Ultra, and Meta has Llama 4.
The build-the-biggest-brain-possible phase of AI is largely done for now, and what comes next is inference.
Every ChatGPT question, summarized document, and AI agent task runs on inference, a nonstop 24/7 workload powering the modern AI economy.
The inference market is projected to grow from $106B in 2025 to $255B by 2030, and that's the whole game.
This is the moment Cerebras has been building toward, and they showed up to the IPO window right as the industryâs center of gravity is moving in their direction.
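For what it's worth, those two endpoints imply a growth rate you can check in a few lines. The dates and dollar figures are the projection quoted above; the math is just a compound-growth formula.

```python
# Implied annual growth of the quoted inference market projection:
# $106B in 2025 -> $255B in 2030.
start, end, years = 106e9, 255e9, 5
cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # ~19.2% per year
```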
Why memory stocks are ripping
Here's something that's hiding in plain sight right now.
While everyoneâs been debating the Cerebras IPO, a signal has been flashing in the memory chip market.
Micron (MU) just posted its strongest earnings in years, and SK Hynix is hitting record highs on the back of HBM3E demand.
The narrative everyone's running with is that this is purely an Nvidia story: more GPUs shipped means more high-bandwidth memory needed.
That's true, but there's a second-order read that almost nobody is writing about.
When memory prices go up, it doesn't just benefit the companies selling memory; it also makes the case for alternatives to memory-heavy architectures much more compelling.
Nvidia's H100 uses 80GB of HBM3 memory, the expensive, power-hungry, increasingly scarce kind.
Every server you build around H100s is a server youâre filling with the most contested component in the AI supply chain.
This is exactly why I took a massive position in MU about a month ago, and that position has already gone up over 100% as the market finally started realizing memory was becoming one of the real bottlenecks of the AI economy.

Source: Milk Road PRO
If you want to see how weâre positioned around the AI infrastructure trade, you can come join Milk Road PRO.
Cerebras' WSE-3 doesn't use HBM at all; it has 44GB of on-chip SRAM baked directly into the wafer.
That means no external memory stacks, no HBM3 supply-chain exposure, and no SK Hynix or Micron pricing risk hitting your data center costs every quarter.
So as MU and SK Hynix go higher, two things happen simultaneously.
NVIDIA-based infrastructure gets more expensive to build and operate, and the pitch for Cerebras, a chip that sidesteps that cost entirely, gets easier to make to every CFO in the room.
CRYPTO SHOULD WORK HARDER FOR YOU
Most people hold crypto and hope.
The smart money? They're earning interest on it, borrowing against it without selling, and trading it.
Where can you do the same all in one place? Nexo.
And right now, new U.S. clients get 30 days of Wealth Club Premier (benefits normally reserved for loyalty program members):
- Enhanced interest rates on your digital assets
- Lower borrowing costs against your crypto
- Up to 0.5% cashback on trades
No need to sell to access liquidity. No juggling 5 different platforms.
*Disclaimer: Geographic restrictions and terms apply.

THE CHIP THAT REFUSED TO BE SMALL (P2)
Think about it from the hyperscalerâs perspective.
Youâre Amazon, Microsoft, or Google, and youâre trying to build out inference capacity at scale.
Every quarter, your GPU cluster costs go up not just because Nvidia raises prices but because the HBM memory inside each server is getting more expensive too.
And the memory pricing data is starting to look parabolic, even after already climbing for more than six straight months.
South Korean DRAM export prices recently jumped another +35%, flash memory prices surged +47%, and SSD pricing spiked nearly +140% in just a matter of weeks.
That is exactly the kind of supply chain inflation hyperscalers hate seeing when theyâre planning multi-billion-dollar AI infrastructure budgets.
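To make that concrete, here's a rough sketch of how a memory-price move compounds across a fleet. Every dollar figure and fleet size below is an illustrative assumption of ours, not a quoted price; only the 80GB-per-GPU figure and the +35% move come from above.

```python
# How HBM inflation hits a GPU server bill of materials.
# All prices and fleet sizes here are hypothetical assumptions.

GPUS_PER_SERVER = 8        # typical 8-GPU H100 box
HBM_GB_PER_GPU = 80        # the 80GB of HBM3 quoted above
HBM_PRICE_PER_GB = 15.0    # assumed $/GB, purely illustrative

hbm_per_server = GPUS_PER_SERVER * HBM_GB_PER_GPU * HBM_PRICE_PER_GB
extra = hbm_per_server * 0.35   # apply the +35% DRAM move quoted above

print(f"HBM per server: ${hbm_per_server:,.0f}")               # $9,600
print(f"Extra after +35%: ${extra:,.0f}")                      # $3,360
print(f"Across 10,000 servers: ${extra * 10_000 / 1e6:.1f}M")  # $33.6M
```

Small per-box numbers turn into real money at hyperscaler fleet sizes, which is the whole point.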

Micron's latest guidance suggested HBM pricing could stay elevated through at least 2026 and into 2027.
That cost pressure is exactly what pushes procurement teams to start evaluating alternatives.
Not as a replacement for everything theyâve already built on CUDA, but as a dedicated inference layer where the economics make more sense.
AWS has already figured this out: their Bedrock deployment with Cerebras isn't a science experiment; it's a hedge against the HBM pricing cycle hitting their inference margins.
For investors watching CBRS trade on its first day, this is the macro setup that doesnât show up in the S-1 but absolutely shows up in the order pipeline.
A rising memory tide doesn't lift all boats equally, but sometimes it makes people start looking very seriously at the boat that doesn't need the water.
Now here's where we stop being impressed and start being honest.
Strip away the engineering marvel and the oversubscription frenzy, and a few things stand out.
86% of 2025 revenue came from two entities with UAE ties, while U.S. revenue fell 34% to $187M.
And the $20B compute infrastructure deal OpenAI signed with Cerebras Systems, while transformative, is still a conditional contract.
If Cerebras misses key delivery milestones tied to the buildout, OpenAI can terminate the agreement, which could also trigger repayment demands tied to the companyâs $1B loan facility.
In other words, Cerebras' biggest growth driver is also one of its biggest financial risks.
The governance structure is complicated, too.
Sam Altman is simultaneously a personal investor in Cerebras, the CEO of its largest customer, and connected to the company as a creditor through the OpenAI financing arrangement.
During the OpenAI vs. Elon Musk lawsuit, co-founders were accused of not fully disclosing those personal investments when procurement decisions were made.
None of this has been proven illegal, but it is exactly the kind of relationship structure that makes institutional investors and compliance teams nervous.
And yet despite all of those risks, the market is still willing to value Cerebras Systems at roughly 91x trailing revenue. That is dramatically richer than peers like Nvidia, AMD, and even Arm Holdings.
So what exactly are investors betting on?
OpenAI is not running workloads on Cerebras hardware because Andrew Feldman (CEO of Cerebras) sent Sam Altman a nice holiday card.
They're doing it because inference speed matters, and a jump from roughly 150 tokens per second to 2,000 tokens per second is the kind of gap that makes agentic AI actually work in real time.
Amazon Web Services integrated Cerebras into Bedrock for a similar reason.
Cerebras' disaggregated inference architecture reportedly delivers around 5x more capacity in the same physical footprint.
And that directly impacts data center economics, power usage, and infrastructure efficiency.
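Here's a quick sketch of why that throughput gap matters for agents specifically. The chain length and tokens per step are illustrative assumptions of ours; the tokens-per-second figures are the ones quoted above.

```python
# Agent chains stack latency: each step waits on the one before it.
# Step count and tokens per step are hypothetical assumptions.

STEPS = 10              # sequential model calls in one agent task
TOKENS_PER_STEP = 500   # output generated per call

for name, tps in [("~150 tok/s (GPU-class)", 150),
                  ("~2,000 tok/s (Cerebras)", 2_000)]:
    total = STEPS * TOKENS_PER_STEP / tps
    print(f"{name}: {total:.1f}s end-to-end")

# ~150 tok/s:   33.3s -> you watch a spinner
# ~2,000 tok/s:  2.5s -> it feels interactive
```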
And then there's the backlog.
Cerebras ended 2025 with roughly $24.6B in remaining performance obligations.
For a company with just over $500M in annual revenue, that is an enormous number, which means the company theoretically has years of growth already contracted.
The demand is there, the contracts are signed, and now Cerebras just needs to prove it can deliver the hardware at scale.
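Two quick back-of-envelope checks on those figures, using the numbers above plus our simplifying assumption that the 91x multiple sits on the same ~$500M revenue base (the S-1 math is messier than this).

```python
# Sizing the backlog and the multiple. The revenue base is the
# "just over $500M" figure above; treating it as the base for the
# 91x multiple is our simplification, not the filing's.

revenue = 0.5e9      # ~$500M trailing annual revenue
backlog = 24.6e9     # remaining performance obligations
multiple = 91        # trailing revenue multiple

print(f"Backlog coverage: ~{backlog / revenue:.0f} years "
      f"at the current run rate")                               # ~49 years
print(f"Implied valuation: ~${multiple * revenue / 1e9:.0f}B")  # ~$46B
```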
The honest verdict
Cerebras is a real technology company solving a real problem at exactly the right moment in the AI cycle.
The customer list, the backlog, and the architectural advantage are all genuine.
Day one pop? Almost certain; this stock will likely run up into the 200s. I'll decide whether to buy based on the morning price action and alert our PRO members on Discord, so make sure you're in there.
The 3-year thesis? That's a bet on whether $24.6B in backlog converts to diversified revenue before the market starts asking harder questions about the financials.
High conviction for inference supercycle believers. A lot to swallow for everyone else.
And if you want our full breakdown on which upcoming IPOs we're buying, avoiding, trading, or waiting on, you can check out the detailed PRO report here.
Alright, that's it for this edition of Milk Road AI. We want to hear from you.
Reply to this email with your vote:
- Bullish: Real technology, real backlog, buy the dip after the pop.
- Skeptical: The customer concentration and falling U.S. revenue disqualify this at any price.
- Wait and see: Come back when U.S. revenue is actually growing again.

Midnight is a fourth generation blockchain that just launched. Check out their launch announcement here.
Real Finance Blockchain is an EVM-compatible L1 that is built specifically for RWA tokenization. Read more about Real Finance Blockchain here.
Nexo is back in the U.S. - and new clients get 30 days of Wealth Club Premier perks! Higher yields, lower borrowing rates, and crypto cashback - start here.

BITE-SIZED COOKIES FOR THE ROAD 🍪
NVIDIA tops $40B in equity bets as AI investment pace goes into overdrive. NVIDIA is no longer just a chipmaker; it's now the most aggressive bet-placer in AI.
OpenAI adds real-time voice translation across 70+ languages to its developer API. Developers can now build live, multilingual voice apps with GPT-5-class reasoning baked in.
Google teases Android 17 and a major Gemini AI overhaul days before I/O 2026. Gemini 4 is rumored to be a massive leap, and Google I/O on May 19 is the big reveal.

MILKY MEMES 🤣


ROADIE REVIEW OF THE DAY 🥛