Google Cloud Next 2026: $200B in Capex Doesn't Buy Production Maturity
Alex Scroxton's piece in Computer Weekly on Google Cloud Next 2026 lands on the right anxiety — the industry is producing too much AI slop and not enough AI value — but it stops one floor short of where I want the conversation to happen. As a DevOps engineer who actually has to put these stacks into production at enterprise scale, my problem with Cloud Next isn't the Gemini-enhanced DJ set. It's the gap between the keynote and the runbook.
Thomas Kurian opened the event with a line that should make any SRE flinch: "You have moved beyond the pilot, the experimental phase is behind us." He's backed by a real number — three quarters of Google Cloud customers already use Google's AI in some form — and a real cheque: roughly $200 billion of capex committed to AI infrastructure. New silicon (TPU 8i for inference, TPU 8t for training). A formally unveiled Gemini Agent Platform. The Capcom case study: a multi-agent workbench running 30,000 human hours a month of work. Citi Sky, a bilingual wealth advisor built on the full Google AI stack.
That's the marketing package. The technical package, the one nobody at Moscone wanted to talk about on the main stage, is the operational bill that comes with the words "in production."
The slop critique is right, but it's not the load-bearing problem
Scroxton's central worry is creative: AI-generated music, AI-generated art, the cultural cost of replacing humans with statistical mediocrity. Fair, and I share most of it on a personal level. But from where I sit, slop is downstream of a more dangerous decision: shipping AI workloads into shared production environments without the operational maturity those environments demand.
Slop dilutes a Spotify playlist. Bad AI in production at a bank dilutes the audit trail. One is annoying. The other is an open invitation to regulators.
The hype-vs-reality framing the Computer Weekly piece chooses is the artistic one. I want to insist on the engineering one, because that's where the bodies will turn up.
"Beyond the pilot" is not a vibe. It's a discipline.
When Kurian says the experimental phase is behind us, the honest translation is: a lot of customers have a Vertex AI notebook, a Gemini API key, and at least one feature flag pointing real traffic at a model. That is not the same thing as "in production" the way a payments team or an SRE team uses the term.
To put a model in production, in the sense I'd defend in front of a board or a regulator, you need at minimum:
- Observability that understands the workload. Latency p99 per model and per route. Token cost per request. Cache hit rate. Eval drift on a frozen benchmark. None of that comes "for free" with a Gemini Agent Platform deploy. You build it.
- Cost attribution down to the team. If three pods share an inference endpoint, who pays for which spike? With $200B of capex sitting upstream of you, the cloud bill stops being a finance footnote and becomes a primary engineering concern.
- Incident response that knows about non-deterministic systems. A model that was correct yesterday and wrong today isn't a deploy bug. It's a behavioural regression with no clean rollback target. Your runbook needs to reflect that.
- Governance that survives an audit. Lineage of which prompt template produced which decision, against which model version, with which retrieval context. If you can't answer that under deposition, you don't have a production system. You have a demo with traffic.
None of these were the headline at Cloud Next. They were assumed away. That's the gap.
The Capcom number, read like an SRE
Capcom's workbench is genuinely impressive: a visual inspection agent on Gemini Vision, a predictive agent on historical data, an institutional-knowledge agent, a data-inefficiency agent. The result, per the keynote: "30,000 human hours every month."
Read that as a DevOps engineer, not as a CMO.
Thirty thousand hours a month means roughly 170 full-time equivalents of agent work (at about 175 working hours per person per month) running continuously against production data. The questions I'd want answered before I sign that off:
- What's the SLO on each agent? Not the LLM provider's SLA. The composite SLO end-to-end, including retrieval, post-processing, and the human review loop.
- What's the failure mode when the predictive agent is silently wrong for a week? Predictions don't crash. They drift. Do you have an offline eval running on a frozen golden set, and an alert when the agent's outputs diverge by more than X%?
- Who's on call for the agent stack? If pod A's data-inefficiency agent is starving pod B's retrieval index of cache, who pages whom? "The platform team" only works if the platform team exists and has authority.
- What's the rollback path? Models are versioned, but prompt templates, retrieval indices, and tool definitions usually aren't. A bad change in any of those three can degrade quality without tripping a single classic deploy alarm.
None of this is exotic. It's the boring SRE checklist that has been the difference between "we use AI" and "AI works for us" for two years. The keynote tone — the experimental phase is behind us — risks pushing customers into the first camp while skipping the work that gets you to the second.
The agent platform stack: more layers, same on-call
The Gemini Agent Platform sits on Vertex AI sits on TPU sits on the new 8i/8t generation. From a buyer's perspective that looks like an integrated story. From a DevOps perspective it looks like four more layers in the dependency graph, each with its own failure mode, its own quota system, its own pricing curve, and its own console where an incident can hide.
Pair that with the genuine multi-cloud reality of most enterprises and you get a familiar pattern: a Gemini Agent integrating with workloads on AWS, with data in Snowflake, with auth in Okta, with observability split across Datadog and a self-hosted Grafana. The neat slide at Moscone hides a hairball.
Two consequences I'd be paying attention to as a CTO right now:
- Vendor lock-in is no longer a procurement question, it's an operational one. Once your prompt templates, eval suites, and agent orchestration live inside one vendor's platform, the migration cost stops being measurable in licensing and starts being measurable in months of SRE work.
- Capex commitments are bets someone else makes on your behalf. When a hyperscaler announces $200B in capex, the implicit promise is that prices stay favourable while utilisation ramps. The implicit risk is that, three years in, the unit economics force a pricing reset that lands directly on your platform team's roadmap.
Neither of those is doom. Both are the kind of risk you build into a three-year platform plan when you write it honestly.
Citi Sky and the part of the announcement I actually endorse
The case study I'd hold up against the slop critique is Citi Sky. A bilingual wealth advisor, built on the Google AI stack, explicitly framed as augmenting human advisors, not replacing them. That framing is the one I keep coming back to in every post on this site: AI is implementable and valuable when it expands what an expert can do per hour, not when it tries to substitute for the expert.
Citi Sky also drops a quieter signal worth picking up: a regulated financial institution doesn't bet a wealth advisory product on AI without serious controls underneath. Whatever the keynote showed, the team behind it has data lineage, decision logging, model-risk-management review, and a human-in-the-loop pattern they can defend to the OCC. That's the part of the iceberg the conference doesn't put on the slide.
If your AI initiative doesn't have an equivalent iceberg, you don't have a Citi Sky. You have a chatbot with a brand.
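What does the submerged part of that iceberg look like in code? One small piece is the decision log: a record, written per request, that pins the exact prompt template, model version, and retrieval context behind each decision. A sketch, assuming nothing about Citi's actual implementation — every name here is invented:

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)  # frozen: an audit record is never mutated
class DecisionRecord:
    request_id: str
    model_version: str
    prompt_template_id: str
    prompt_template_sha256: str       # pins the exact template text used
    retrieval_doc_ids: tuple          # pins the retrieval context
    decision: str
    reviewer: str                     # human sign-off; "" means fully automated
    recorded_at: str

def record_decision(store, *, request_id, model_version, template_id,
                    template_text, retrieval_doc_ids, decision, reviewer=""):
    """Append an immutable lineage record; `store` stands in for your
    real append-only audit sink."""
    rec = DecisionRecord(
        request_id=request_id,
        model_version=model_version,
        prompt_template_id=template_id,
        prompt_template_sha256=hashlib.sha256(template_text.encode()).hexdigest(),
        retrieval_doc_ids=tuple(retrieval_doc_ids),
        decision=decision,
        reviewer=reviewer,
        recorded_at=datetime.now(timezone.utc).isoformat(),
    )
    store.append(rec)
    return rec
```

Hashing the template text rather than trusting the template ID matters: it's the difference between "we think we know which prompt ran" and "we can prove it under deposition."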
What I'd want from next year's keynote
If Google Cloud Next 2027 wants to convince the people who actually keep these systems up, here's what I'd put on the main stage instead of a Gemini DJ:
- Production patterns for agentic systems with named SLOs. Not "we run 30,000 hours a month." Show me p99 latency, eval drift bands, and the human review rate, and tell me which numbers are non-negotiable.
- A first-class cost-attribution story. Per-team, per-agent, per-route. With chargeback primitives in the platform itself, not bolted on by every customer.
- Honest failure modes for the Agent Platform. What happens when retrieval is stale? When tool calls loop? When a downstream API rate-limits an agent mid-conversation?
- A reference operating model for platform teams. Sizing, on-call rotation, the split between agent-product engineers and platform engineers. The Coinbase pitch of pure solo operators is one extreme; the unspoken assumption that the hyperscaler will absorb operational complexity for you is the other. Both are wrong.
- A grown-up conversation about evals. As Microsoft's own ActionNex paper showed for AIOps, the state of the art on real incidents is around 53% recall. That's not a number you put behind a wealth advisor without a very thick human-review layer. The keynote tone should reflect that, not paper over it.
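The first bullet — composite SLOs, not the provider's SLA — deserves one concrete illustration, because the arithmetic is unforgiving. In a serial request path, availabilities multiply (assuming independent failures). All the figures below are hypothetical stage availabilities, not anyone's published numbers:

```python
# Why the end-to-end SLO is always worse than the vendor's headline SLA:
# availabilities of serial dependencies multiply.

def composite_availability(stages: dict) -> float:
    avail = 1.0
    for a in stages.values():
        avail *= a
    return avail

AGENT_PATH = {
    "api_gateway": 0.9999,
    "retrieval": 0.999,
    "llm_provider": 0.995,      # the provider's headline SLA
    "post_processing": 0.9995,
    "human_review_queue": 0.999,
}

end_to_end = composite_availability(AGENT_PATH)
# ~0.9924: four nines at the gateway, barely two nines end to end --
# roughly 66 hours of degraded service per year.
```

That's with generous assumptions; add retries, tool calls, and cross-cloud hops and the number only drops. This is the slide I want at Moscone.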
The line I'm drawing
Scroxton is right that the industry is producing too much slop. But slop is a creative-economy problem. The infrastructure problem — and it's the one I have to live with — is that "in production" has been quietly redefined to mean "calling an LLM from a request path that already serves users." Those are not the same. The first requires SLOs, observability, cost attribution, governance, and incident response designed for non-deterministic systems. The second requires an API key.
Two hundred billion dollars of capex buys a lot of TPUs. It doesn't buy a platform team. It doesn't buy a runbook. It doesn't buy the operational maturity that decides whether the AI in your stack is a productivity multiplier or a latent incident waiting for its first bad Friday.
The companies that win the next two years of AI won't be the ones that adopted fastest. They'll be the ones whose platform teams refused to call something "production" until it actually was.
Sources:
- Alex Scroxton, Google Cloud Next: It's time to create value, not slop, from the AI boom, Computer Weekly, April 23, 2026. computerweekly.com
- Google Cloud Next 2026 keynote, Thomas Kurian — figures on customer adoption, TPU 8i/8t, Gemini Agent Platform.
- Capcom and Citi case studies as presented at Google Cloud Next 2026.
Putting AI into production and unsure whether your platform can carry the operational weight? Talk to a CTO — we'll help you separate the keynote from the runbook.