DigitalTransformation on Carles Abarca

From Cloud Lock-In to Cognitive Lock-In

Mon, 08 Jun 2026 00:00:00 +0000

For years, enterprises worried about cloud lock-in. AI introduces a deeper dependency: cognitive lock-in.

For the last decade, technology leaders have understood the strategic risk of cloud lock-in. Once core infrastructure, data platforms and business applications are deeply embedded in a single hyperscaler, switching becomes expensive, slow and operationally painful.

But artificial intelligence is introducing a more subtle and potentially more important form of dependency: cognitive lock-in.

When an organization builds critical processes entirely on top of external AI APIs, it is not only outsourcing compute. It is outsourcing part of its reasoning, classification, summarization, decision support and customer interaction capabilities. In other words, it is outsourcing pieces of how the organization thinks.

That does not mean external APIs are bad. Quite the opposite. Frontier models from OpenAI, Anthropic, Google and others have accelerated AI adoption dramatically. They remain essential for experimentation, advanced reasoning and high-complexity tasks.

But as AI moves from pilots to production, organizations need to ask a more strategic question:

Which parts of our intelligence layer can we afford not to control?

The first wave: buying intelligence through APIs
#

The first wave of generative AI adoption was API-first, and for good reason.

APIs made it possible to experiment quickly. No infrastructure. No model training. No GPU procurement. No specialized machine learning operations team. A developer could build a prototype in days. A business unit could test a use case in weeks. A digital transformation team could prove value without waiting for a multi-year platform program.

This was the right way to start.

For innovation, speed matters. For discovery, flexibility matters. For early adoption, removing friction matters.

But what works for experimentation does not always work for operational scale.

When AI becomes embedded in core workflows — customer support, internal knowledge management, document analysis, software development, financial operations, academic services, compliance review, sales enablement — the economics and risks change.

The question is no longer only: Does the model work?

The questions become:

How much does this cost when usage grows 10x?
What happens if the provider changes pricing?
What happens if usage limits are reduced?
What happens if a model is deprecated?
What happens if latency becomes unacceptable?
What happens if our data cannot leave our environment?
What happens if a vendor decision breaks a critical process?

At that point, AI is no longer just a tool. It becomes part of the operating model.

The cost problem is not only price. It is uncertainty.
#

Many organizations evaluate AI cost as a price-per-token problem. That is useful, but incomplete.

The bigger issue is not the absolute price of an API call today. It is the uncertainty of depending on an external pricing model for an internal process that may become mission-critical.

A proof of concept can tolerate variable costs. A production process cannot.

If an AI service is used occasionally by a small group of employees, an API-first approach may be perfectly fine. But if that service becomes part of thousands or millions of interactions — every student query, every customer support case, every internal search, every document classification, every automated summary — cost predictability becomes a strategic requirement.

Organizations need to know not only whether AI is powerful, but whether it is financially governable.

This is where open source and open weight models become important.

Not because they are always better. Not because they will replace frontier models. But because they can provide cost certainty for high-volume, repeatable and well-defined workloads.

A privately deployed model has infrastructure costs. It has operational costs. It requires governance. But it also allows an organization to understand, forecast and optimize the cost of intelligence under its own rules.

That matters.

Cognitive lock-in is deeper than cloud lock-in
#

Cloud lock-in was about infrastructure dependency.

Cognitive lock-in is about intelligence dependency.

The difference is profound.

When a company depends on a cloud provider, it depends on infrastructure: compute, storage, networking, managed services, databases. These are critical, but they are still mostly execution layers.

When a company depends entirely on external AI APIs, it may depend on them for cognitive functions:

summarizing information,
classifying documents,
prioritizing work,
recommending actions,
generating responses,
extracting insights,
detecting anomalies,
supporting decisions,
interacting with customers, employees or students.

These are not peripheral capabilities. They are part of the organization’s intelligence layer.

If every business process calls the same external AI API, that API becomes part of the operating model. If pricing changes, the operating model changes. If performance changes, the operating model changes. If access is restricted, the operating model is exposed.

This is why cognitive lock-in deserves executive attention.

It is not a technical architecture concern only. It is a strategy, risk and governance concern.

Open source AI is not a cheap Plan B
#

There is a common misunderstanding about open source AI: that its main value is being cheaper.

Cost matters, but that is not the full story.

The strategic value of open source and open weight models is control.

Control over where the model runs. Control over which version is used. Control over what data is sent. Control over logs and observability. Control over latency. Control over tuning. Control over when to upgrade. Control over how performance is measured. Control over how costs scale.

For many enterprise use cases, the best model is not necessarily the most powerful frontier model available. The best model is the one that reaches the required quality threshold with the right cost, latency, privacy and governance profile.

A smaller model deployed privately may be more appropriate than a frontier API for tasks such as:

internal knowledge search,
document classification,
meeting summarization,
support ticket triage,
policy Q&A,
form extraction,
routine code assistance,
translation,
controlled content generation,
domain-specific assistants.

In these scenarios, sovereignty is not ideological. It is practical.

The goal is not to reject commercial AI platforms. The goal is to avoid blind dependency.

The mature AI stack will be hybrid
#

The future of enterprise AI will not be single-model.

It will be hybrid, governed, cost-aware and sovereign.

Frontier APIs will remain essential. They will be the best option for complex reasoning, advanced creativity, new use cases, low-frequency high-value tasks and situations where state-of-the-art performance matters more than cost.

Open source and privately deployed models will become essential for high-volume workloads, sensitive data, predictable costs, domain-specific tasks and operational resilience.

Specialized models will serve narrow use cases where precision, latency or domain knowledge matters more than general capability.

The real enterprise capability will not be choosing one model. It will be building a governed intelligence layer that routes each task to the right model based on cost, risk, quality and data sensitivity.

A mature AI architecture will ask:

Is this task sensitive?
Does it require frontier-level reasoning?
What is the acceptable error rate?
What is the cost per successful task?
Can a smaller model do this well enough?
Does the data need to stay inside our environment?
Do we need auditability?
Do we need cost certainty?

This is where AI strategy becomes architecture.

AI sovereignty is not isolation
#

AI sovereignty does not mean building everything internally. It does not mean rejecting global platforms. It does not mean every organization must train foundation models from scratch.

That would be unrealistic and unnecessary.

AI sovereignty means having enough control over the intelligence layer to make strategic decisions independently.

It means knowing which tasks should use frontier APIs and which should not. It means having alternatives. It means being able to benchmark models internally. It means measuring cost per task, not only tokens. It means understanding where sensitive data flows. It means designing resilience before dependency becomes too expensive to unwind.

In this sense, AI sovereignty is not a technology posture. It is an operating principle.

The leadership question
#

The most important question for leaders is no longer:

Which AI model is best?

That question is too narrow.

The better question is:

Which parts of our organizational intelligence must we control ourselves?

Some intelligence can be rented. Some can be bought. Some can be accessed through APIs. But some must be governed, protected and operated as a strategic capability.

The organizations that understand this early will not abandon frontier models. They will use them intelligently. They will combine them with open models, private deployments, internal benchmarks, orchestration layers and cost governance.

They will move from AI adoption to AI architecture.

And that may become one of the defining differences between organizations that merely use AI and organizations that truly own their intelligence.

Your AI Startup Is Not a Company: It Is a Feature of OpenAI or Anthropic

Tue, 05 May 2026 00:00:00 +0000

And you are building inside a crack that may close with you still inside it.

In March 2025, a solo developer celebrated on Twitter that his PDF summarization tool had reached $12,000 in MRR. He used the GPT-4 API, added a polished interface, charged $19 per month, and life was good. Six weeks later, Google launched Gemini 2.5 with native support for processing 1,500-page documents. For free. The solopreneur’s MRR did not decline gradually — it collapsed like a building in an earthquake.

It was not an isolated case. It was geology.

The tectonic plates of AI
#

Think of OpenAI, Google, Anthropic, and Meta as tectonic plates. They are continental masses of capital, talent, data, and compute capacity that move slowly — but when they move, they reconfigure the entire landscape.

Between those plates there are cracks. Temporary gaps where the ground appears stable. Places where an agile entrepreneur can set up shop, plant a flag, and declare that they have found a market. And for a while, they are right. The crack is real, the space exists, customers pay.

The problem is that cracks between tectonic plates are not solid ground. They are zones of friction. And when the plates move — because they always move — the crack closes.

With the entrepreneur inside.

The cemetery of features that thought they were companies
#

The list is long and keeps growing:

Code Interpreter killed code-execution startups. Dozens of companies built products to execute code inside LLM conversations. Then OpenAI added it as a native ChatGPT feature. There was no transition. There was extinction.

GPT-4 Vision wiped out image-description startups. Companies that charged for analyzing images with AI vanished the day vision became a standard capability of the base model.

“Chat with your documents” tools became a commodity. What in 2023 was a differentiated product is, in 2025, a free feature inside Google Drive, Microsoft 365, and Notion. All at once.

Image-generation wrappers that added improved prompts on top of DALL-E or Midjourney watched each new model release make their “enhancement” layer unnecessary.

The pattern is always the same: an entrepreneur identifies a limitation in the foundation model, builds a solution around that limitation, and celebrates having found product-market fit. But what they found was not a market — it was a temporary bug in a giant’s offering.

The solopreneur trap
#

It has never been easier to build an AI product. A single developer, with Claude Code or Cursor, can have a functional MVP in a weekend. And that feels like a superpower.

But that ease is precisely the trap.

If you can build your product in a weekend, what makes you think OpenAI cannot add that functionality in its next release? You are not competing with other solopreneurs — you are competing with organizations that have thousands of engineers, billions in funding, and access to the foundation model your “company” is built on top of.

It is like opening a souvenir shop inside a dormant volcano. Rent is cheap, the view is spectacular, and tourist traffic is incredible. Until the volcano stops being dormant.

The AI solopreneur is not democratizing technology. They are occupying an unstable market space — installed in a crack that the tectonic movement of hyperscalers can close at any time. And the cruelest part is that the better they do, the more visible they become to the plates that will swallow them. Success is the signal that the crack is worth closing.

The difference between a feature and a company
#

Not everything built on top of an LLM is a doomed feature. Some AI startups really are companies. The difference lies in what they have beyond the model:

Proprietary data the model does not have. If your competitive advantage is that you train or fine-tune on data no one else owns — specific industrial data, regulatory history, specialized corpora — the plates can move and your ground still holds. You are not in the crack; you are on your own island.

Network effects that strengthen with use. Every new user makes the product better for everyone else. A marketplace, a community, a collaboration system. A foundation model cannot replicate that by adding a feature.

Deep integration into existing workflows. If your product is embedded in a company’s daily process — connected to its ERP, CRM, or legacy systems — the switching cost is real. It is not an app that gets uninstalled when the base model improves.

Domain expertise the model cannot replicate. Some sectors have regulation, process specificity, or contextual complexity that require knowledge far beyond what a prompt can solve. Healthcare, legal, regulated finance, industrial manufacturing. There, the model is an ingredient, not the dish.

If your startup does not have at least one of these four elements, you do not have a company. You have a feature with temporary revenue.

The crack test
#

Before celebrating your next MRR milestone, ask yourself these questions:

1. Could they add it in a release? If the core functionality of your product can be replicated by OpenAI, Google, or Anthropic adding a feature to their next version, you are in the crack. It does not matter that they have not done it yet. What matters is that they can.

2. Does your advantage survive an improvement in the base model? Every time a more capable model is released, does your product become more valuable or less necessary? If the answer is “less necessary,” you are betting against gravity.

3. Can you explain your moat without mentioning the model? If your pitch begins with “We use GPT-4 to…”, you have already lost. Your moat must exist independently of the model you use underneath. If you switch from OpenAI to Anthropic to Gemini and your value proposition disappears, it was never your value proposition.

4. Are you selling a capability or an outcome? Capabilities commoditize. Always. “Summarize documents” is a capability. “Reduce your bank’s regulatory compliance cycle from six weeks to three days” is an outcome. Outcomes require context, integration, and expertise that are much harder to commoditize.

What to do if you are in the crack
#

Do not panic, but move quickly. The crack may not close tomorrow — but it will close.

First: accept reality. Your wrapper is not a moat. Your beautiful UI is not a moat. Your prompt engineering is not a moat. None of that protects you from a competitor that controls the model your product is built on.

Second: find your own ground. Can you generate proprietary data? Can you create network effects? Can you integrate so deeply into your customer’s workflow that removing you requires a migration project? If the answer to all of those is no, you have a cash extraction business, not a company. Extract the cash, but do not lie to yourself about what you are building.

Third: build the business that remains when you remove the model. If you take GPT-4 out of your product and nothing is left, you do not have a product. If you take GPT-4 out and what remains is a workflow, a database, a community, an integration — then you have something that may survive the next tremor.

The earthquake ahead
#

The hyperscalers are not going to stop moving. On the contrary — they are accelerating. Every release is more capable, every platform absorbs more functionality, every model makes another abstraction layer unnecessary.

I am not saying it is impossible to build a great business on top of AI. It is possible, and it is happening. But the ones that survive are not the ones that found a crack and opened a shop. They are the ones that built on their own rock.

The question is not whether the plates will move. The question is whether, when they do, you will be standing on solid ground or become another solopreneur celebrating an MRR with an expiration date.

Look down. Do you see the crack?

Carles Abarca is VP of Digital Transformation at Tec de Monterrey and former CTO of Banco Sabadell. He writes about the strategic implications of AI at carlesabarca.com.

Stop Overpaying for Intelligence

Fri, 24 Apr 2026 00:00:00 +0000

The default behavior of most AI-powered products today is simple: when in doubt, call the frontier model.

Every customer message classified by GPT-5. Every extraction task routed through Claude. Every prompt, no matter how trivial, handled by the same multi-trillion-parameter machine that was designed to reason about legal strategy, write pharmaceutical patents, and debug distributed systems.

It works. It’s fast to integrate. It makes the product feel intelligent.

And it is quietly becoming one of the most expensive habits inside the modern AI stack.

The Overpaying Default
#

A few weeks ago I wrote about the end of cheap AI — the moment when subscription limits, rate caps, and honest inference costs finally started reflecting the real economics of frontier models. That’s the macro story.

This is the micro story that sits underneath it.

The reason so many companies are about to feel the squeeze is not only because frontier prices are rising. It’s because the typical architecture was built on a silent assumption: that there was no cost worth worrying about, so the best model should handle everything. That assumption is breaking down from two sides at once.

From one side, frontier inference is getting more expensive, metered harder, and subsidized less.

From the other side — and this is the part most roadmaps haven’t priced in yet — local and open-weight models have quietly become good enough for a very large share of real enterprise tasks.

That combination changes the economics of AI more than any single product announcement this year.

What Local Models Can Actually Do Now
#

A few years ago, “run your own LLM” meant a heroic engineering project, a clear downgrade in quality, and an infrastructure team that secretly missed the cloud.

Today, it doesn’t.

The current generation of open-weight models — Llama, Qwen, Mistral, DeepSeek, Gemma, and their derivatives — has crossed capability thresholds that would have sounded like science fiction in 2023. A 70B-parameter open-weight model running on a single high-end workstation or a modest GPU instance now performs competitively on the benchmarks that mattered most to enterprises at the start of this cycle: general reasoning, code completion, summarization, extraction, translation, structured output.

And it keeps getting better. Fast.

This doesn’t mean frontier and open-weight are interchangeable. They are not. Frontier still pulls clearly ahead on long-context coherence, multi-step agentic planning, novel domain synthesis, and the hardest tiers of code generation.

But what matters for an AI roadmap is not whether local models have caught up on everything. It is whether they are good enough on the specific tasks your system actually performs.

And for most enterprise workloads in 2026, the answer is increasingly yes.

The Map Most AI Architectures Are Missing
#

If you look carefully at what happens inside most AI-powered applications, the workloads split cleanly into two groups.

Tasks that genuinely need a frontier model:

Long-context reasoning across dozens of documents.
Multi-step agentic planning over ambiguous goals.
Complex code generation from scratch in unfamiliar domains.
Creative synthesis that blends multiple expert voices.
Handling highly adversarial or edge-case inputs that require real judgment.

Tasks that almost certainly don’t:

Classifying an email, ticket, or document by type.
Extracting entities, dates, and amounts from a text.
Summarizing a page or two of content.
Rewriting a paragraph in a different tone.
Producing structured output (JSON, SQL) from plain text.
Translating between major languages.
Answering FAQs from a retrieval layer.
Deterministic sub-steps inside a larger agent.

Most AI architectures treat these two groups the same way. They shouldn’t.

The job of any mature AI stack — and I use “mature” here in the adult-phase sense, as opposed to the sugar-rush phase of the last two years — is to route each task to the right tier. Frontier when it earns it. Open-weight when it doesn’t.

The word I’ve been using internally for this is task discrimination. Not in the political sense — in the architectural one. The ability to recognize that different tasks deserve different intelligence budgets, and to design accordingly.

It’s Not Just About Cost
#

Cost is the most visible reason to care about task discrimination. It is not the only one.

There are four other reasons that keep compounding the more deeply an organization uses AI.

Latency. A local 8B or 13B model running next to your application can return a classification in under 100 milliseconds. A round-trip to a frontier cloud API is rarely that fast. For interactive experiences, user-facing agents, or high-frequency internal automations, that gap matters.

Privacy and data residency. Routing every customer email, patient chart, student record, or internal memo through a third-party model is a governance posture that is aging badly. Regulators have noticed. Boards have noticed. For an increasing number of use cases — health, education, legal, defense, government, and anything covered by local data protection regimes — local inference is not an optimization. It is a requirement.

Reliability. When your architecture depends on a single frontier provider, you also depend on their rate limits, their subscription restrictions, their outages, and their commercial roadmap. That is a level of systemic dependency that would raise eyebrows in any other part of the tech stack.

Determinism and control. A smaller model you fully control, fine-tuned or prompt-tuned for a narrow task, often behaves more predictably than a generalist frontier model optimized to handle the entire universe. Predictability is underrated until it is missing.

None of these points is, on its own, a reason to abandon frontier models. All of them together are a reason to stop defaulting to frontier models for everything.

The Numbers Are Not Subtle
#

Let me illustrate with a simple scenario.

Imagine a mid-sized organization running a million lightweight AI calls a month: a mix of classification, extraction, summarization, and structured output. Say the average call uses around a thousand tokens in and out.

Routed through a top-tier frontier model, the inference bill for those calls lands comfortably in the tens of thousands of euros per month. Multiply by twelve, add growth, and this is the kind of line item that starts showing up in CFO reviews.

The same workload, routed through a well-hosted open-weight model — either on-prem or on a dedicated GPU instance at a specialized provider — comes out an order of magnitude cheaper, sometimes two. And the quality difference, on precisely these task types, is typically invisible to end users.

That is not a rounding error. That is the difference between AI being a sustainable operational capability and AI being a line item your CFO starts questioning at every forecast.

And the organizations that realize this first will not use the savings to shrink. They will use them to scale further.

What This Looks Like On My Own Desk
#

The most honest way to write about task discrimination is to describe what I actually run, not what I think other people should run.

In my own setup, I have a Mac Studio dedicated to serving local models to my agents. It sits quietly on a shelf, publishes a private inference endpoint through LM Studio, and hosts a small library of open-weight models optimized for MLX — the framework that lets these models take full advantage of Apple Silicon’s GPU and unified memory.

Nothing about that machine is exposed to the public internet. The endpoint lives inside my own network, behind the boundaries any serious setup demands. For the kind of work I route through it, that is not optional.

I chose a Mac Studio over the obvious alternative — a dedicated GPU rig — for reasons that are not purely technical. It is powerful enough for the model sizes that actually matter to me. It is extraordinarily reliable. It is almost perfectly silent. And its idle power draw is low enough that I can leave it on 24/7 without thinking twice. None of that matters when you are renting H100s by the hour. It matters a lot when the machine is a permanent piece of your operating stack.

The architecture itself is deliberately simple.

The main orchestrator — the LLM that gives my agents their judgment and planning capability — is a frontier model. That is where the hard reasoning happens, where ambiguity has to be resolved, where the whole plan needs to hold together. For that role, paying for the best is worth it.

But underneath the orchestrator, routing rules push subagent tasks to my local endpoint whenever it is possible or recommended. Local handles the grunt work. Frontier handles the thinking.

The result is that my frontier bill has collapsed without any perceptible loss of quality in the end-to-end experience. Not because local has caught up on everything — it has not — but because a very large share of what any agent actually does is not reasoning in the hard sense. It is classifying. Extracting. Summarizing. Reformatting. Translating. Producing structured output.

Models like qwen3.6-35b-a3b-ud-mlx, gemma-4-31b-it-mlx, or gpt-oss-20b-mlx handle these tasks beautifully. Running locally. With latencies a cloud round-trip cannot match. And without sending a single byte of context to a third party.

That is not a theoretical architecture. That is what is running on my desk, today.

So What Should Actually Change?
#

There is no need to rip anything out. There is a need to rearchitect.

At least across five fronts.

1. Build a task taxonomy
#

Every AI call in your product or operations belongs to a complexity tier. Map them. Most teams discover that more than half of their calls sit comfortably in the “does not need frontier” bucket — and have been happily paying frontier prices for them for years.

2. Start with a router, not a migration
#

The highest-leverage first step is not swapping out your model. It is adding an intelligent routing layer — sometimes as simple as “classify intent, then dispatch” — that sends trivial tasks to a cheaper tier and escalates only when confidence is low or complexity is high.

3. Measure cost and quality per task, not per model
#

The question “which model is best?” is the wrong one. The right question is “which model is best for this specific task at this specific cost?” Build the observability that answers that.

4. Treat local as a capability, not a downgrade
#

Open-weight models are no longer a consolation prize. In many workflows they are the right tool — faster, cheaper, more private, more controllable. The teams still talking about them defensively are signaling how recently they last looked.

5. Design for hybrid as the default
#

The interesting AI architectures of 2026 will not be pure-frontier or pure-local. They will be orchestrated systems that blend a frontier model for the hard parts, open-weight models for the routine parts, and fine-tuned small models for the narrow, high-volume parts — each one called when, and only when, it earns its keep.

The Real AI Cost Lever of 2026
#

The dominant narrative this year will continue to focus on the frontier: bigger models, higher benchmarks, sharper capabilities. That narrative is real, and it matters.

But underneath it, there is a quieter shift that will determine which organizations actually build sustainable AI operations — and which ones end up rationalizing aggressive cost cuts in 2027.

The shift is not about choosing between frontier and local. It is about learning to use both, deliberately, at the right moments, in the right combinations.

The cheapest AI optimization available in 2026 is not a better deal from your current provider.

It is the decision to stop using a frontier model for work a local model can do just as well.

Intelligence is becoming abundant. Discernment is becoming the scarce resource.

The companies that will win the next phase of this cycle are not the ones paying the most per call.

They are the ones who have figured out which calls don’t need to be paid at all.

DigitalTransformation on Carles Abarca

From Cloud Lock-In to Cognitive Lock-In

The first wave: buying intelligence through APIs #

The cost problem is not only price. It is uncertainty. #

Cognitive lock-in is deeper than cloud lock-in #

Open source AI is not a cheap Plan B #

The mature AI stack will be hybrid #

AI sovereignty is not isolation #

The leadership question #

Your AI Startup Is Not a Company: It Is a Feature of OpenAI or Anthropic

The tectonic plates of AI #

The cemetery of features that thought they were companies #

The solopreneur trap #

The difference between a feature and a company #

The crack test #

What to do if you are in the crack #

The earthquake ahead #

Stop Overpaying for Intelligence

The Overpaying Default #

What Local Models Can Actually Do Now #

The Map Most AI Architectures Are Missing #

It’s Not Just About Cost #

The Numbers Are Not Subtle #

What This Looks Like On My Own Desk #

So What Should Actually Change? #

1. Build a task taxonomy #

2. Start with a router, not a migration #

3. Measure cost and quality per task, not per model #

4. Treat local as a capability, not a downgrade #

5. Design for hybrid as the default #

The Real AI Cost Lever of 2026 #