In a previous piece, I argued that large language models are not enterprise architecture. The response was clear: that argument is hard to dismiss. The harder question is what comes next: “if not this, then what?”
It’s the right question. Because the problem was never that AI doesn’t work. It clearly does. The problem is that we tried to place it in the wrong layer.
We didn’t fail at AI. We failed at where we put it.
Over the last two years, companies have invested tens of billions of dollars in generative AI. The result is not ambiguity. It’s clarity.
A growing body of research, including a widely cited MIT study, shows that around 95% of enterprise generative AI initiatives fail to deliver measurable business impact, despite widespread adoption.
This is not because the models don’t work: it’s because they were inserted into organizations as tools, not as systems. We tried to bolt intelligence onto workflows. What we need are systems in which intelligence is the workflow.
From stateless tools to persistent systems
Large language models are, by design, stateless: each interaction starts from scratch unless we artificially reconstruct context.
Companies are the opposite. They are stateful systems: they accumulate decisions, track relationships, evolve over time, and depend on continuity.
This mismatch is not a minor inconvenience. It is structural. Research on enterprise AI failures consistently points to the same issue: systems fail not because they generate bad outputs, but because they cannot integrate into ongoing processes or maintain context over time.
Enterprise AI cannot be session-based. It has to remember.
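To make the contrast concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `stateless_answer` stands in for a session-based model call, and `AccountMemory` is a hypothetical store of past decisions, not any real product's API.

```python
from dataclasses import dataclass, field


def stateless_answer(question: str) -> str:
    """Stand-in for a session-based model call: it sees only the current question."""
    return f"answer to: {question}"


@dataclass
class AccountMemory:
    """Hypothetical persistent record of decisions made about one customer account."""
    decisions: list[str] = field(default_factory=list)

    def record(self, decision: str) -> None:
        self.decisions.append(decision)

    def context(self) -> str:
        return "; ".join(self.decisions) if self.decisions else "no prior decisions"


def stateful_answer(question: str, memory: AccountMemory) -> str:
    """Same stand-in call, but the accumulated history travels with every request."""
    return f"answer to: {question} (given: {memory.context()})"
```

The difference is not in the model call, which is identical in both functions, but in whether the system around it owns the history.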
From answers to outcomes
We optimized AI to answer questions. But companies need systems that change outcomes. This is where the gap becomes obvious: an LLM can generate a compelling sales strategy, but it cannot track whether it worked, adapt based on results, coordinate execution across teams, or improve over time.
That’s not a limitation of implementation: it’s a limitation of design.
The same MIT research describes a “GenAI Divide”: organizations are stuck in high adoption but low transformation, precisely because current systems don’t close the loop between action and outcome.
Answers don’t change companies: systems do.
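Closing the loop between action and outcome can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any system in the research cited above: `OutcomeLoop` and its strategy names are hypothetical, and the selection rule (try everything once, then favor what has worked) is deliberately naive.

```python
from collections import defaultdict


class OutcomeLoop:
    """Hypothetical tracker that ties each action to its observed result."""

    def __init__(self, strategies: list[str]) -> None:
        self.strategies = list(strategies)
        self.attempts = defaultdict(int)
        self.wins = defaultdict(int)

    def record(self, strategy: str, succeeded: bool) -> None:
        """Store what actually happened after a strategy was executed."""
        self.attempts[strategy] += 1
        if succeeded:
            self.wins[strategy] += 1

    def success_rate(self, strategy: str) -> float:
        tried = self.attempts[strategy]
        return self.wins[strategy] / tried if tried else 0.0

    def next_strategy(self) -> str:
        # Try anything untested first, then favor what has worked so far.
        for strategy in self.strategies:
            if self.attempts[strategy] == 0:
                return strategy
        return max(self.strategies, key=self.success_rate)
```

A model that only answers stops before `record` is ever called; a system that changes outcomes feeds every result back into its next choice.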
From prompts to constraints
Much of today’s AI conversation revolves around prompts. But prompts are just an interface. Companies don’t operate through prompts; they operate through constraints: compliance rules, permissions, risk thresholds, and operational boundaries.
And this is where most AI systems break. They generate within probabilities. Companies operate within constraints.
This is one of the least discussed and most important reasons why enterprise AI initiatives stall. Broader AI research points the same way: projects fail when systems are not aligned with real-world constraints, workflows, and decision contexts.
Prompts are UX. Constraints are architecture.
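A short Python sketch shows what it means for constraints to sit in the architecture rather than in the prompt. The rule values, role names, and the `ProposedAction` type are all hypothetical, chosen only to illustrate the pattern: whatever the model proposes, a deterministic check decides whether it runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    """Hypothetical action a model proposes, e.g. issuing a refund."""
    kind: str
    amount: float
    requested_by_role: str


# Illustrative rules only; real limits would come from policy, not code constants.
MAX_REFUND = 500.0
ROLES_ALLOWED_TO_REFUND = {"support_lead", "finance"}


def violations(action: ProposedAction) -> list[str]:
    """Return every rule the action breaks; an empty list means it may proceed."""
    problems = []
    if action.kind == "refund":
        if action.amount > MAX_REFUND:
            problems.append("amount exceeds refund limit")
        if action.requested_by_role not in ROLES_ALLOWED_TO_REFUND:
            problems.append("role lacks refund permission")
    return problems


def execute(action: ProposedAction) -> str:
    """Run the action only if no constraint is violated."""
    problems = violations(action)
    if problems:
        return "blocked: " + ", ".join(problems)
    return f"executed {action.kind} of {action.amount:.2f}"
```

The point of the design is that the check does not depend on how the request was phrased: a persuasive prompt cannot talk its way past `violations`.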
From copilots to systems of action
The dominant metaphor of the last two years has been the “copilot.” It sounds appealing, but it’s also misleading. A copilot suggests. A company needs systems that act. This distinction matters, because suggesting is cheap. Executing is hard.
Execution requires:
- integration with systems of record
- coordination across processes
- ownership of outcomes
- adaptation over time
And this is precisely where most current approaches collapse. Not because they are poorly implemented, but because they were never designed for it.
The architecture shift no one is talking about
What, then, replaces this? Not better prompts, not bigger models, and definitely not more infrastructure. The next phase of enterprise AI will be defined by something else entirely:
Systems that combine:
- persistent state
- embedded workflows
- continuous learning from outcomes
- operation under constraints
- integration with real environments
In other words: systems that don’t just generate language about the world, but operate within it.
Research and practice are converging on the same conclusion: success comes not from generic tools, but from systems that adapt, learn, and embed into workflows.
Why this shift will feel like a discontinuity
We are still early in this transition. Most organizations are investing in the visible layer: models, interfaces, infrastructure. But the real shift is happening one layer deeper.
And when it becomes visible, it won’t look like an incremental improvement: it will look like a discontinuity. Because we are not moving from “worse AI” to “better AI.” We are moving from tools that talk to systems that act.
The real opportunity
This is not the end of enterprise AI: it is the end of a misconception. Language models are not enterprise architecture; they are an interface layer. A powerful one, but insufficient on its own.
The companies that understand this first won’t simply deploy AI better. They will build something their competitors won’t recognize until it’s too late.
