Don't Panic! Getting Your Data Ready for AI Without Boiling the Ocean

Paul Sala
May 25

When the AI team says you have a data problem, here's how to hear it as an opportunity rather than a crisis

Paul Sala | Co-Founder, Ibex Ascent | May 2026 | 8 min read

Every serious AI conversation eventually becomes a data conversation.

That is usually the moment the room gets quiet.

Executives know what is sitting underneath the surface. Duplicated records. Inconsistent definitions. Ageing platforms. Unofficial spreadsheets that somehow became the source of truth. Knowledge bases that nobody quite trusts. Reports that tell different stories depending on which system you pull from. And a large portion of critical process knowledge that exists only inside the heads of people who have been around long enough to know how things really work.

The stakes are real.

42% of companies abandoned most of their AI initiatives in 2025, up from just 17% the year before (S&P Global). Gartner reports that 85% of AI projects fail due to poor data quality or lack of relevant data. The pattern is consistent: it is rarely the model that lets organisations down. It is the foundation underneath it.

So when someone says "we need to get our data right before we can do AI properly," it can feel like the opening line of a very expensive conversation. Fix the whole data estate. Clean every system. Standardise every definition. Rebuild governance from scratch. Then, eventually, perhaps, the organisation can start doing something useful with AI.

That is the wrong starting point. Organisations that treat data readiness as a prerequisite transformation programme tend to stall before they ever ship anything, while the pressure to show AI progress keeps building around them.

The practical question is not: is all our data ready for AI?

The better question is: what data and knowledge need to be ready for this AI use case to deliver value safely?

That distinction changes everything about the size, cost and risk of the work ahead.

The Two Mistakes Most Organisations Make

Before describing a better approach, it is worth naming the two failure modes clearly, because most organisations lean toward one of them.

The first is paralysis. The organisation treats data readiness as a prerequisite transformation, scopes it at enterprise level, and either never starts or stalls under the weight of it. The AI programme idles while the data programme runs. Momentum and stakeholder confidence drain away together.

The second is avoidance. The organisation treats data readiness as a box to tick, connects the AI to whatever data happens to be available, and discovers the gaps through user complaints, wrong answers and eroded trust. Fixing data problems after deployment is significantly harder, more expensive and more politically damaging than addressing them beforehand.

A pragmatic approach avoids both. It starts with a specific business outcome, identifies the data that matters for that outcome, prepares it to a level that matches the risk, and proves it works before expanding. Gartner's research makes the point directly: organisations that focus on scope management rather than wholesale data transformation are the ones that actually ship.

There Is a Method to This

Practitioners who do this well follow a structured approach, even if it rarely looks like a formal programme from the outside. It moves through a connected set of disciplines, each applied to the use case at hand rather than the enterprise as a whole.

It starts with the outcome, not the data estate. The use case defines the boundary. An AI assistant helping customer service agents does not need every piece of customer data in the enterprise. It needs the information required to resolve customer issues safely: customer history, product details, entitlements, policies, service procedures and known exceptions. Defining that boundary is the first act of discipline, and it immediately makes the work smaller and more tractable.
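For readers who want to see what drawing that boundary might look like in practice, here is a minimal Python sketch. The required items come from the customer service example above; the catalogue entries, system names and the scope_use_case helper are hypothetical, invented purely for illustration.

```python
# Data the customer service use case needs, taken from the example above.
REQUIRED = {
    "customer_history",
    "product_details",
    "entitlements",
    "policies",
    "service_procedures",
    "known_exceptions",
}

# Hypothetical enterprise catalogue: dataset name -> system that holds it.
CATALOGUE = {
    "customer_history": "CRM",
    "product_details": "product catalogue",
    "entitlements": "billing platform",
    "policies": "policy intranet",
    "marketing_campaigns": "marketing suite",
    "supplier_contracts": "procurement system",
}


def scope_use_case(required: set[str], catalogue: dict[str, str]) -> dict[str, list[str]]:
    """Split the data estate into what this use case needs and what it does not."""
    available = set(catalogue)
    return {
        # Needed and already held in a system: prepare it for the use case.
        "prepare": sorted(required & available),
        # Needed but not held anywhere: knowledge that must be captured first.
        "capture": sorted(required - available),
        # Held in a system but not needed: leave it alone for now.
        "ignore_for_now": sorted(available - required),
    }


result = scope_use_case(REQUIRED, CATALOGUE)
for label, names in result.items():
    print(f"{label}: {', '.join(names)}")
```

In this made-up catalogue, service procedures and known exceptions land in the "capture" group because no system holds them, while marketing and supplier data stay out of scope entirely.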

It matches the data standard to the level of risk the AI actually carries. There is a meaningful difference between AI that helps a person find information, AI that recommends an action, and AI that takes action autonomously. Each carries a different risk level and requires a different standard of data rigour. Applying enterprise-grade governance to a low-risk internal assistant wastes time and credibility. Applying chatbot-level thinking to an AI operating inside a regulated process creates real exposure.
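One way to make this concrete is to write the three levels down alongside the standard each one demands. The sketch below is an assumption-laden illustration, not a recommended policy: the field names, the thresholds and the example "policy wiki" source are all hypothetical, and each organisation would set its own.

```python
from dataclasses import dataclass
from enum import Enum


class AIRole(Enum):
    INFORMS = "helps a person find information"
    RECOMMENDS = "recommends an action"
    ACTS = "takes action autonomously"


@dataclass(frozen=True)
class DataStandard:
    named_owner_required: bool
    max_staleness_days: int
    audit_trail_required: bool


# Illustrative thresholds only; these numbers are assumptions, not guidance.
STANDARDS = {
    AIRole.INFORMS: DataStandard(False, 90, False),
    AIRole.RECOMMENDS: DataStandard(True, 30, False),
    AIRole.ACTS: DataStandard(True, 1, True),
}


@dataclass(frozen=True)
class Source:
    name: str
    has_named_owner: bool
    days_since_refresh: int
    has_audit_trail: bool


def shortfalls(source: Source, role: AIRole) -> list[str]:
    """List the ways a data source falls short of the standard for a given AI role."""
    standard = STANDARDS[role]
    problems = []
    if standard.named_owner_required and not source.has_named_owner:
        problems.append("no named owner")
    if source.days_since_refresh > standard.max_staleness_days:
        problems.append("data too stale")
    if standard.audit_trail_required and not source.has_audit_trail:
        problems.append("no audit trail")
    return problems


# Hypothetical source: an unowned wiki last refreshed 45 days ago.
policy_wiki = Source("policy wiki", False, 45, False)
for role in AIRole:
    print(f"{role.name}: {shortfalls(policy_wiki, role) or 'fit for purpose'}")
```

The same source passes for an assistant that only helps people find information, and fails on progressively more counts as the AI moves toward recommending and then acting.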

It accounts for the knowledge that never made it into any system. Research suggests that up to 90% of workplace knowledge is tacit, residing in people's experience rather than in documented processes. IBM research puts a sharper point on it: 68% of enterprise data remains unanalysed, trapped in silos and undocumented processes. Before AI, organisations survived this informality because experienced employees filled the gaps with judgement. AI does not inherit that judgement.

The good news is that this is where generative AI itself becomes a genuine asset. Converting tacit knowledge into usable, structured information used to be a slow and expensive exercise in workshops, documentation sprints and knowledge management projects. Now it can be dramatically accelerated. Interviews, process walkthroughs and operational stories can be summarised, structured and turned into draft decision rules, exception logs and test scenarios at a pace that was simply not possible before.

AI drafts. Humans validate. Governance publishes. Operations maintains.

Converting tacit knowledge no longer requires a separate programme. It requires the right approach.
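The four-step handover above (AI drafts, humans validate, governance publishes, operations maintains) can be sketched as a simple one-way workflow. This is a minimal illustration under assumed names; Stage, NEXT_STAGE and advance are hypothetical and do not refer to any particular tool.

```python
from enum import Enum


class Stage(Enum):
    DRAFTED = "drafted by AI"
    VALIDATED = "validated by a human expert"
    PUBLISHED = "published by governance"
    MAINTAINED = "maintained by operations"


# Each stage can only move to the next one, so nothing is published
# without first being validated by a person.
NEXT_STAGE = {
    Stage.DRAFTED: Stage.VALIDATED,
    Stage.VALIDATED: Stage.PUBLISHED,
    Stage.PUBLISHED: Stage.MAINTAINED,
}


def advance(current: Stage) -> Stage:
    """Move a knowledge item to its next stage, refusing to go past the last one."""
    if current not in NEXT_STAGE:
        raise ValueError(f"'{current.value}' is the final stage")
    return NEXT_STAGE[current]


# Walk one hypothetical decision rule through the whole workflow.
stage = Stage.DRAFTED
history = [stage]
while stage in NEXT_STAGE:
    stage = advance(stage)
    history.append(stage)

print(" -> ".join(s.value for s in history))
```

The point of the design is the missing shortcut: there is no path from an AI draft straight to publication.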

It builds the minimum foundation needed for the use case, and resists the pressure to prepare for every possible future. This is where the instinct to over-engineer causes the most damage. The concern is often well-intentioned: if we are going to invest in data readiness, we should prepare for everything AI might eventually do, not just what it is doing today. In practice, that logic consistently produces programmes that are too large to finish, too expensive to sustain and too abstract to demonstrate value. Data readiness is iterative. Build for where the AI is today, prove it works, and extend the foundation when the use case genuinely expands.

Gartner predicts that through 2026, organisations will abandon 60% of AI projects unsupported by AI-ready data. The trap is not failing to prepare enough. It is preparing for the wrong scope.

It proves trust through testing, not assertion. A fluent, confident AI answer is not the same as a reliable one. Technical testing tells you whether the system retrieved a document. Business testing tells you whether the answer is one the organisation would be willing to stand behind. The feedback loop between users, AI outputs and the data and knowledge underneath is not optional. Without it, quality problems stay vague and frustrating. With it, every failure becomes a way to improve.
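The difference between the two kinds of test can be shown with a small sketch. Everything here is hypothetical: the Scenario and AIResponse types, the refund example and the document name are invented for illustration, and the keyword check is only a crude stand-in for the judgement of a subject expert.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    question: str
    expected_source: str      # document the answer should be drawn from
    must_mention: list[str]   # points a subject expert says the answer needs


@dataclass
class AIResponse:
    answer: str
    sources: list[str]


def technical_pass(scenario: Scenario, response: AIResponse) -> bool:
    """Did the system retrieve the document it was supposed to?"""
    return scenario.expected_source in response.sources


def business_pass(scenario: Scenario, response: AIResponse) -> bool:
    """Does the answer contain every point the organisation would stand behind?"""
    text = response.answer.lower()
    return all(point.lower() in text for point in scenario.must_mention)


# Hypothetical scenario: the right policy is retrieved, but the answer
# leaves out an exception that an experienced agent would always mention.
scenario = Scenario(
    question="Can this customer get a refund?",
    expected_source="refund_policy.pdf",
    must_mention=["30 days", "opened items"],
)
response = AIResponse(
    answer="Yes, refunds are available within 30 days of purchase.",
    sources=["refund_policy.pdf"],
)

print("technical:", technical_pass(scenario, response))
print("business:", business_pass(scenario, response))
```

In this example the response passes the technical check and fails the business one, which is exactly the kind of gap a feedback loop is meant to surface.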

It makes data ownership and quality part of normal operations. AI data readiness does not usually require a new operating model. It requires the organisation to follow the one it already claims to have. Data ownership that exists only on paper, source-of-truth decisions that live in someone's memory, quality expectations that were never made explicit: these are not new problems. AI just makes them consequential in ways that are harder to ignore.

An Unexpected Upside

There is something worth naming that tends to surprise executives who go through this work seriously.

The disciplines that make data fit for AI (clear ownership, agreed definitions, explicit quality standards, documented exceptions and trusted sources) are the same disciplines that make an organisation's operating model work properly in the first place. Many organisations have been papering over governance and process gaps with informal human judgement for years. AI does not create those gaps. It exposes them.

Executives who approach data readiness as an AI prerequisite often find that the real return is broader. Decisions get made on better information. Process exceptions get documented rather than carried in people's heads. Ownership ambiguities that caused friction for years get resolved. The AI programme becomes the catalyst for operational improvements that would have been worth making regardless of AI.

That is not a reason to scope the work any more broadly than the use case requires. But it is worth knowing that the investment tends to compound in ways that go beyond the AI programme itself.

The Right Response When the AI Team Raises Data Readiness

The right response is not to panic, and it is not to minimise.

It is to ask the practical questions. What outcome are we improving? What data and knowledge does this use case actually need? What risk level does the AI carry, and what does that mean for how carefully we need to prepare?

What knowledge currently lives in people's heads that needs to be made explicit? Who owns the data, and can they answer for it? How will we test whether the AI is using it correctly?

Those questions are not a bureaucratic exercise. They are the foundation for an AI programme that delivers value without creating the kind of trust-damaging failures that are increasingly visible across the industry.

The organisations getting this right are not the ones with the cleanest data estates. They are the ones that treat data readiness as a scoped, structured discipline rather than either a crisis or an afterthought.