Hebo AI


AI Sovereignty is not just Data Residency

We have been hands-on implementing AI across different teams and departments, and realized that we have hit an inflection point in model capabilities.

A few thoughts have been bouncing around in our minds that you might find interesting:

Not even two years ago, the first version of DeepSeek created a big buzz: could open source models beat proprietary labs that were investing billions of dollars into R&D?

The buzz faded quickly, and for the longest time, the answer seemed to be "no." While the models kept improving, the gap always remained too large. But that tide has turned this year, with nearly a dozen open models now competing head-to-head with proprietary ones in the Top 20 leaderboard on OpenRouter.

OpenRouter leaderboard showing open models among top-ranked AI models

For example, Alibaba's Qwen 3.6 models are now reaching capability levels comparable to recent Claude and ChatGPT versions, at a fraction of the model size, and with that, a fraction of the cost. They are efficient enough that developers can even run them on their own laptops.

This should force countries to rethink their AI agendas. So far, too many have focused only on "data residency", attracting massive data center investments while outsourcing operations to US hyperscalers who capture fat margins of 50%+. The servers may sit locally, but the profits, control, and leverage remain elsewhere.

As AI agents begin taking on more and more work previously done by knowledge workers, for example autonomously coding for hours, this starts to look less like cloud infrastructure and more like a foreign tax on domestic productivity. Every token sent, every workflow automated, means value leaves the country.

Most of these data center projects are still unfinished. Right now, in the majority of countries, none of the models from OpenAI, Anthropic, or Google can actually be run within their own borders. Not even older versions. So what is the real alternative?

Developers naturally choose platforms like OpenRouter or Vercel AI Gateway because they offer a five-minute quick start and some free tiers. But these are also US platforms, and once your stack is built around them, there are very few reasons to leave. Enterprises face the opposite trap: long procurement cycles through system integrators, usually attached to telcos, while paying heavy markups on top of already expensive token fees.

Yes, the new data centers will eventually open. Give it 12-18 months, and countries will be able to run the latest versions of ChatGPT, Claude, and Gemini within their own borders.

But if the entire stack, the models, developer ecosystem, orchestration layer, and margins, still belongs to someone else, then local hosting changes very little. You still pay the tax, just more quietly.

It is common practice for cloud companies like Amazon, Microsoft, and Google to hand out credits to startups, helping them avoid the traditional capex required for servers when starting a business. Historically, this was a win-win situation, because these systems still required large local engineering teams to build on top of.

More recently, those same credits are being used for AI agents doing real work like coding. Startups downsize their teams or hire fewer staff in the first place. Hyperscalers are effectively financing payroll replacement. And we are not talking about a few hundred dollars. We are talking hundreds of thousands of dollars per company.

The old narrative that startups naturally create local jobs no longer holds in the same way.

Open models create a rare window to change these dynamics. You no longer need the world's best PhDs in machine learning and neural networks to participate. Innovation is moving up the stack. It is about orchestrating models and building great user experiences. It is about operating these systems efficiently at scale, based on fast-changing usage patterns. And it is about building local capabilities: local developers, local platforms, local companies, and local ownership of the value chain.

Nowadays, you can literally describe what you want to build in plain language, like talking to a colleague. Yes, it really works; it is not just a cool demo. Prompting is becoming the new programming. Everyone can prompt. But while you no longer need a Computer Science degree to build things, the economic equation leaves a huge amount of talent behind.

The average person thinks twice before signing up for the $100/month that AI companies nowadays charge for access to full capabilities and reasonable token budgets. Even small teams easily spend $100 per day on AI usage.

And although the large language models run in the cloud, the agent still runs on your own machine. This requires a computer no more than a couple of years old, preferably one of Apple's Macs.

That is why 95% of people are still using the free version of ChatGPT or Gemini and do not even realize what modern AI is actually capable of.

This is where open models matter most.

Running an efficient open model instead of the latest frontier proprietary model can cut inference costs by a factor of 2 to 5. Cutting out the hyperscaler layer and operating directly on local infrastructure can reduce the remaining cost by another 50%.

Model cost comparison showing lower inference costs for efficient open models

Combined, this is not a small optimization. It can be the difference between AI being accessible only to a small elite and AI being usable by everyone.
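To make the combined effect concrete, here is a rough back-of-the-envelope sketch. It assumes the two savings stack multiplicatively (the 50% applies to whatever cost remains after switching models), which is our reading of the figures above rather than a measured result. The function name and the use of the $100-per-day team budget as a baseline are purely illustrative.

```python
def combined_cost_factor(model_reduction: float, infra_saving: float) -> float:
    """Return how many times cheaper the combined setup is than the baseline.

    model_reduction: cost reduction factor from switching models (e.g. 2 to 5).
    infra_saving: fraction of the remaining cost saved on infrastructure (e.g. 0.5).
    Assumption: the two effects are independent and multiply.
    """
    if model_reduction <= 0:
        raise ValueError("model_reduction must be positive")
    if not 0 <= infra_saving < 1:
        raise ValueError("infra_saving must be in [0, 1)")
    return model_reduction / (1.0 - infra_saving)


# Range quoted in the text: 2x to 5x from the model, another 50% from infrastructure.
low = combined_cost_factor(2, 0.5)
high = combined_cost_factor(5, 0.5)

# Illustrative only: apply the range to a team spending $100 per day.
baseline_per_day = 100.0
print(f"Combined reduction: {low:.0f}x to {high:.0f}x")
print(f"Daily spend: ${baseline_per_day / high:.0f} to ${baseline_per_day / low:.0f}")
```

Under these assumptions the two effects together give roughly a 4x to 10x reduction, so a $100 daily bill would fall to somewhere between $10 and $25. Real savings depend on workload, model quality requirements, and how well the local infrastructure is utilized.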

The countries that make these tools cheap, available, and deeply integrated into education and business will compound faster. Their students will learn faster, their startups will iterate faster, and their companies will operate with higher output.

AI literacy will become economic infrastructure, just like electricity or broadband. Some countries will use this moment to build real sovereignty. Others will finance someone else's empire.
