AI Sovereignty Isn't Data Residency. It's Megawatts, Fiber and Wet-Bulb Temperature. (1/3)
This is post 1 of 3 in a series on Sergio Cruzes' AI Infrastructure Sovereignty paper. Part 2 covers the Feasible Sovereign Operating Region; part 3 covers the LLM-as-advisor architecture.
A few weeks ago Sergio Cruzes (Ciena) put a paper on arXiv called AI Infrastructure Sovereignty (2602.10900v4). It's the kind of paper that doesn't go viral on LinkedIn because it doesn't promise anyone a 10x productivity boost, but it should be read by anyone signing off on an "AI sovereignty strategy" this year. Its central claim is simple and uncomfortable: AI sovereignty is no longer a software or a legal problem. It's an infrastructure problem. Data localisation clauses, regional cloud regions and GDPR posture are necessary but nowhere near sufficient. Real sovereignty lives in megawatts, fiber routes and the wet-bulb temperature outside your data hall.
From where I sit — building DevOps and platform engineering for companies that have to defend their AI stack in front of a regulator — this reframing is overdue, and most public sovereignty roadmaps I've reviewed in the last twelve months are still operating one layer too high.
Legal sovereignty vs operational sovereignty
The paper makes a distinction that should become standard vocabulary:
- Legal sovereignty is the layer everyone already understands: jurisdiction, compliance, IP protections, data localisation frameworks. GDPR, the EU AI Act, the CLOUD Act, SecNumCloud-style labels. Lawyers and procurement live here.
- Operational sovereignty is "the practical ability to observe system state, make decisions based on local conditions, validate those decisions, and act within defined policies and physical limits." Engineers and operators live here. Or rather, should live here. In most sovereignty conversations, this layer is missing.
The two layers are not the same thing, and one without the other is window dressing. Legal control without operational capability is nominal. You can hold the contract, the audit reports and the data-residency clause and still depend on someone else's hardware, someone else's optical network and someone else's power purchase agreement to actually run the system. When that someone else is constrained by export controls, sanctions, or a unilateral platform decision, your sovereignty evaporates regardless of what the contract says.
This is the part that most "EU sovereign cloud" conversations elegantly skip. We discuss where the bytes live and ignore where the joules come from.
What "infrastructure" actually means in the paper
Cruzes is unusually concrete for a sovereignty paper. The three layers he treats as substrate aren't abstractions:
- AI data centers — racks pushing past 20–30 kW (the air-cooling ceiling) into liquid cooling as standard. Training clusters demanding tens to hundreds of megawatts per site. Power draw that isn't a flat curve but spikes during collective operations on the millisecond-to-second scale. None of this is exotic in the hyperscaler world; almost none of it is locally controlled in the European sovereign clouds I've reviewed.
- Optical networks — the part most software-centric sovereignty discussions ignore entirely. The speed of light gives a hard floor of ~5 ms per 1,000 km. Training clusters can tolerate about 1 ms of collective-comm latency, which translates to a geographic radius of around 100 km. Inference is more forgiving but still has to sit near demand. Submarine cables — the ones that carry most intercontinental AI traffic — are "difficult to repair. Disruptions can last weeks." Sovereignty of a fiber path is not a contract clause; it is the physical right-of-way and the boat you need to fix it.
- Energy systems — grid capacity, carbon intensity, water for cooling. The paper proposes a Feasible Sovereign Operating Region (FSOR): the intersection where energy availability, carbon intensity and water budget can be satisfied jointly. I'll come back to FSOR in the next post because it deserves its own treatment. The point here is structural: a region without enough grid headroom, with a high-carbon energy mix or seasonal water stress is not sovereign for frontier AI regardless of how good its data-protection laws are.
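The latency geometry above can be sanity-checked in a few lines. The fiber propagation speed (roughly 200,000 km/s, about two-thirds of c in vacuum) is a standard rule of thumb I'm supplying, not a number taken from the paper:

```python
# Back-of-envelope check of the latency geometry: light in fiber
# covers roughly 200 km per millisecond, one way (assumed constant).
FIBER_KM_PER_MS = 200.0

def one_way_latency_ms(distance_km: float) -> float:
    """Physical floor on one-way fiber latency; real paths add routing slack."""
    return distance_km / FIBER_KM_PER_MS

def max_radius_km(rtt_budget_ms: float) -> float:
    """Largest site separation whose round trip fits a latency budget."""
    return (rtt_budget_ms / 2) * FIBER_KM_PER_MS

print(one_way_latency_ms(1000))  # -> 5.0, the ~5 ms per 1,000 km floor
print(max_radius_km(1.0))        # -> 100.0 km for a 1 ms collective budget
```

Note these are floors: amplification, regeneration and non-geodesic routing only push real paths further from them.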
Once you put it that way, you can't seriously claim sovereignty for an AI workload running on imported accelerators, on an optical network operated by a third party, drawing power from a grid you don't control, cooled by water you don't price. You can claim something, but not sovereignty in the operational sense.
Why the cloud-region narrative breaks here
If you take Cruzes' definitions seriously, the dominant European sovereignty narrative — "we'll have sovereign cloud regions operated by EU entities" — solves at most one of three layers, and arguably less than half of that one. A sovereign region implemented on:
- Imported accelerators under foreign export-control regimes,
- Optical capacity leased on transit providers headquartered elsewhere,
- Power purchase agreements with no operational telemetry into the grid,
- Cooling water from a basin under no local stewardship,
is a region in the legal sense and a tenant in the operational sense. The supervisor will accept the legal layer. The physics will not.
I'm not saying the cloud-region effort is wrong. I'm saying it solves the part of the problem that lawyers can verify and leaves untouched the part that engineers and operators have to live with. The first time a cross-border control point — export sanctions, a platform vendor's licensing change, a submarine fiber outage — hits the region, the gap between "legally sovereign" and "operationally sovereign" becomes the only thing that matters.
The telemetry layer is the actual sovereignty layer
The most underrated technical claim in the paper is this: operational sovereignty depends on cross-layer telemetry fusion, and the entity that performs that fusion holds the real keys.
Today there are four distinct protocol ecosystems that have to be joined to produce a unified state representation of an AI data centre:
| Domain | Protocols | Maturity |
|---|---|---|
| Optical networks | OpenConfig / gNMI | Highest |
| Compute / power | Redfish, IPMI, vendor BMC APIs | Heterogeneous |
| Cooling / facilities | BACnet, Modbus | Facilities-grade, not real-time |
| Grid sustainability | WattTime, Electricity Maps (~5-min updates) | Commercial, external |
These four were not designed to talk to each other. They emit different schemas, different cadences, different freshness guarantees. Joining them into a single state vector — Cruzes calls it θ(t) — is not commodity work. It is the work that determines whether you can detect a cascading failure (power spike → thermal event → workload migration → network congestion) before it surprises you, or whether you read about it in the post-mortem.
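To make the fusion work concrete, here is a minimal sketch of joining per-domain readings into a single θ(t) snapshot with freshness tracking. The domain names, staleness budgets and dict-based schema are my illustrative assumptions, not any real protocol binding or vendor API:

```python
import time

# Per-domain freshness budgets in seconds: gNMI streams are near-real-time,
# facilities protocols and grid feeds are much slower. Values are assumed.
FRESHNESS_BUDGET_S = {
    "optical": 5,       # OpenConfig / gNMI streaming telemetry
    "compute": 30,      # Redfish / IPMI polling
    "facilities": 120,  # BACnet / Modbus cadence
    "grid": 300,        # ~5-min carbon-intensity feeds
}

def fuse(readings: dict, now: float) -> dict:
    """Join per-domain readings into one theta(t) snapshot, flagging
    any domain whose last sample exceeds its freshness budget."""
    theta = {"t": now, "signals": {}, "stale": []}
    for domain, budget in FRESHNESS_BUDGET_S.items():
        sample = readings.get(domain)
        if sample is None or now - sample["ts"] > budget:
            theta["stale"].append(domain)  # degraded visibility, not "green"
        else:
            theta["signals"][domain] = sample["value"]
    return theta

now = time.time()
readings = {
    "optical": {"ts": now - 2, "value": {"osnr_db": 18.4}},
    "compute": {"ts": now - 10, "value": {"rack_kw": 27.5}},
    "facilities": {"ts": now - 600, "value": {"supply_c": 19.0}},  # too old
    # "grid" feed missing entirely
}
theta = fuse(readings, now)
print(sorted(theta["stale"]))  # -> ['facilities', 'grid']
```

The point of the `stale` list is exactly the operational-sovereignty argument: a domain you can no longer sample within its budget is a domain where your dashboard is guessing.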
And here is the operational-sovereignty kicker: whoever owns the telemetry fusion layer owns the actual control surface. If you delegate that work to an external platform — a hyperscaler's "AI infrastructure suite", a vendor's bundled observability product — you have delegated operational visibility along with it. Your dashboard says "everything green." You no longer know what would turn it red, on whose terms, or with what latency.
I'd put this more bluntly than the paper does: telemetry fusion is the new system of record for AI infrastructure, and most European operators don't have one. They have dashboards built on someone else's pipeline. Which is fine until it isn't.
What this means for a regulated client this year
For the kind of client I work with — banks, healthcare, energy, public sector — translating this into a practical posture means dropping one comfortable assumption and adopting four uncomfortable ones:
- Drop the assumption that data residency is the sovereignty conversation. It's the easiest layer, the one your DPO can already explain, and the one a competent adversary or a regulatory accident bypasses fastest. It belongs in the answer, not as the whole answer.
- Adopt the assumption that you need an inventory of physical dependencies. For each AI workload in production or planned: which accelerators (and under whose export regime), which optical paths (and whose right-of-way), which power source (and whose PPA), which cooling resource (and whose water rights). Most teams cannot answer this for their existing stack today. The first job is the inventory, not the architecture.
- Adopt the assumption that telemetry fusion is your responsibility. Even if you operate on top of someone else's iron, you can — and in regulated sectors you should — own the layer that fuses your operational signals into a state representation you can reason about, audit, and present to a supervisor without translation. Without that, your incident reports will always be written in your vendor's vocabulary, on your vendor's timetable.
- Reclassify the system, not the model. I keep saying this, including in my read of McKinsey's 2026 AI trust report: the regulator wants the full sociotechnical system classified, which now explicitly includes the physical layer. The Article 9 risk-management file for a high-risk AI system that doesn't acknowledge the energy, optical and cooling dependencies is incomplete by the paper's own logic.
- Accept that sovereignty is a spectrum, not a binary. Cruzes is clear about this: no region achieves absolute sovereignty. All of them exist somewhere on a curve defined by which layers are locally controlled and which external dependencies are accepted. The honest sovereignty roadmap is the one that names the dependencies and prices them, not the one that claims to remove them all.
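A minimal version of the physical-dependency inventory might look like the following. Every field name and example row is a hypothetical illustration I'm supplying, not a real client's stack:

```python
from dataclasses import dataclass, field

# The four layers from the inventory: accelerators, optical paths,
# power, cooling. Names are my own shorthand, not the paper's.
LAYERS = {"accelerator", "optical_path", "power", "cooling"}

@dataclass
class AIWorkloadDeps:
    name: str
    accelerator: str           # family + export-control regime
    optical_path: str          # path + right-of-way holder
    power: str                 # source + PPA counterparty
    cooling: str               # resource + local stewardship
    locally_controlled: set = field(default_factory=set)

def external_dependencies(w: AIWorkloadDeps) -> set:
    """Layers an external actor could withhold from this workload."""
    return LAYERS - w.locally_controlled

# Hypothetical example row
inference = AIWorkloadDeps(
    name="claims-triage-llm",
    accelerator="imported GPUs, foreign export regime",
    optical_path="leased wavelength, third-party operator",
    power="grid PPA, no operational telemetry",
    cooling="municipal water, local stewardship",
    locally_controlled={"cooling"},
)
print(sorted(external_dependencies(inference)))
# -> ['accelerator', 'optical_path', 'power']
```

The output of this exercise at fleet scale is the "one table, four columns" artifact — and the sovereignty spectrum is exactly how large `external_dependencies` is per workload.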
The part I'm critical of in the AI hype around sovereignty
I'm on record as positive about LLMs and agentic systems in production — I deploy them, I bill for them, my clients pay for them, my own time is "invested" in them in the most literal sense. I'm not the person who needs to be convinced AI is real.
That said, the public sovereignty discourse around AI today has a specific failure mode that the paper makes legible: it treats sovereignty as a content problem and not as a substrate problem. The pitch is "your data stays in-country," and the unspoken corollary is "everything beneath your data is someone else's problem." Cruzes shows that the substrate is not someone else's problem; it is the problem, because the substrate is what an external actor can actually withhold.
The hype version of sovereign AI is a model trained on local data, hosted in a local cloud region, marketed under a local flag, running on the same imported silicon and the same long-haul optical capacity as everyone else's stack. The paper's version of sovereign AI is the same model, but with the question "under which physical constraints can I keep operating if my external dependencies are revoked?" honestly answered. The first version is a slide. The second version is a runbook.
If your sovereignty programme can answer the slide and can't answer the runbook, you are precisely where the rest of the industry is. The work this year is to flip those two states.
What I'd put on the platform roadmap this quarter
For a platform or infrastructure team in a regulated sector, three concrete moves before the next board update:
- Map the physical dependency stack for every AI workload in production. One table, four columns: accelerator family + regulatory regime, optical path + operator, power source + PPA terms, cooling resource + local stewardship. The table will not be pretty. That's the point.
- Start a telemetry-fusion baseline, even if rough. Pick a single workload, pull OpenConfig from your optical layer, Redfish from compute, whatever you can get from facilities, and a commercial carbon-intensity feed. Build a 60-second-resolution θ(t) for that workload. You will discover an embarrassing number of unknown unknowns. That is the value of the exercise.
- Write a one-page sovereignty memo that distinguishes legal from operational sovereignty for your CFO and your supervisor. Even if all you can deliver this quarter is the legal column, owning the vocabulary keeps the board conversation honest. I'd rather walk into a supervisor meeting saying "we control layers 1 and 2, we depend on vendor X for layers 3 and 4, here is the contingency" than walk in claiming a sovereignty I don't have.
The line I'm drawing
The paper's framing — legal vs operational, with physics as the binding constraint — is the right one to hold internally even when the marketing version of sovereignty is what gets quoted externally. AI is real, deployment is accelerating, the value is genuine. None of that changes the fact that the layer at which sovereignty is actually decided has moved beneath most of the industry's current vocabulary.
If your sovereignty programme this year still terminates at the data-residency clause, you are not wrong. You are just one layer too high. The interesting work — and the work a regulator will care about within the next twelve months — is happening below your dashboard.
Sources:
- Sergio Cruzes (Ciena Corporation), AI Infrastructure Sovereignty, arXiv:2602.10900v4, April 2026.
Building AI infrastructure that has to defend itself in front of a regulator and unsure where your sovereignty programme actually terminates? Talk to a CTO — we'll help you separate the legal layer from the operational one before someone else does it for you.


