
Your Data Is a Mess. Start Your AI Project Anyway.

Most companies are told to clean their data before starting an AI project. That advice is expensive, slow, and wrong. Here's how to bake data cleanup into the AI rollout itself.

Kerrigan Baron | April 15, 2026 | 6 min read

Everyone's got inherited data. Legacy systems, migrations that happened three CTOs ago, naming conventions that contradict each other, and databases that look like a junk drawer someone labeled "important."

The standard advice? Clean it up first. Spend months mapping fields, standardizing names, and refactoring systems before you even think about AI.

That advice is expensive. And slow. And in the AI world, every month of delay is momentum you're handing to your competitors.

Here's the better move: bake the cleanup into the AI rollout itself.

Think of it like inheriting a house from a distant relative. It's packed with decades of stuff. Some of it is valuable. Some of it is junk. And some of it is a mystery wrapped in a filing cabinet from 1997. You don't catalog every drawer before you move in. You start living in it and sort as you go.

Same principle applies to your data.

The Playbook

Let AI do the sorting. AI handles large datasets better than humans staring at millions of rows in a database. Let it catalog what's there, flag what's missing, and surface what needs a human decision. Your people spend time making strategic calls, not counting entries.
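To make "catalog what's there, flag what's missing" concrete, here is a minimal sketch of the kind of first-pass profile a cleanup agent might produce. The function name, the field names in the sample, and the 50% threshold are all assumptions for illustration, not anything from a specific tool.

```python
from collections import Counter


def profile_rows(rows, missing_threshold=0.5):
    """Catalog every field seen and flag fields that are mostly empty.

    `missing_threshold` is an assumed cutoff: fields missing in more than
    this share of rows are marked for a human decision.
    """
    total = len(rows)
    present = Counter()
    for row in rows:
        for field, value in row.items():
            if value not in (None, ""):
                present[field] += 1

    all_fields = {field for row in rows for field in row}
    report = {}
    for field in sorted(all_fields):
        missing_rate = 1 - present[field] / total if total else 0.0
        report[field] = {
            "missing_rate": round(missing_rate, 2),
            "needs_review": missing_rate > missing_threshold,
        }
    return report


# Hypothetical sample rows with inconsistent legacy field names.
sample = [
    {"cust_nm": "Ada", "EMAIL": "ada@example.com"},
    {"cust_nm": "Grace", "EMAIL": ""},
    {"cust_nm": "", "EMAIL": None, "phone": "555-0100"},
]
report = profile_rows(sample)
```

The point of the `needs_review` flag is the division of labor: the profile does the counting, and people only look at the fields it surfaces.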

Use a clean container strategy. Picture a water filtration system. Dirty water goes in one side, gets filtered, clean water comes out the other. Same premise. Don't touch the source data. Have AI filter it into a new container with better naming conventions, human-approved decisions baked in, and a structure that matches where you're headed. You might run a few filtering cycles. Each one gets you closer.
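One filtering cycle might look something like the sketch below. The mapping, the field names, and the review queue are hypothetical. The parts that matter are that the source rows are only read, the clean copy uses the approved names, and anything unmapped goes to a human instead of being guessed.

```python
import copy

# Hypothetical, human-approved mapping from legacy names to target names.
FIELD_MAP = {"cust_nm": "customer_name", "EMAIL": "email"}


def filter_into_container(source_rows, field_map):
    """Build a clean copy with renamed fields; never mutate the source rows."""
    clean = []
    review_queue = []
    for index, row in enumerate(source_rows):
        renamed = {}
        unmapped = []
        for field, value in row.items():
            if field in field_map:
                renamed[field_map[field]] = value
            else:
                # No approved name yet: leave it out and ask a human.
                unmapped.append(field)
        clean.append(renamed)
        if unmapped:
            review_queue.append({"row": index, "unmapped_fields": unmapped})
    return clean, review_queue


source = [{"cust_nm": "Ada", "EMAIL": "ada@example.com", "x_flag2": "Y"}]
snapshot = copy.deepcopy(source)  # only here to show the source is untouched
clean, review_queue = filter_into_container(source, FIELD_MAP)
```

Running another cycle is just a matter of extending `FIELD_MAP` with the decisions from the review queue and filtering again. Because the source never changes, a bad cycle costs nothing but a rerun.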

Let your AI agents talk to each other. Your data cleanup agent and your AI project agent need to be in conversation. Set up a shared channel (I use Slack for this) where they discuss in the open with a human watching the back-and-forth. It's slower than a direct agent-to-agent pipeline, but it keeps everyone who needs to know in the loop. The project agent learns what data is coming, which fields are being renamed, and how to restructure its calls as the data gets cleaner.
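As a rough illustration of what the project agent does with those messages, the sketch below models the shared channel as a plain Python list of rename notices. This is not Slack's API. `RenameNotice`, `apply_notices`, and the field names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenameNotice:
    """A message the cleanup agent posts when a field gets a new name."""
    old_name: str
    new_name: str
    approved_by: str  # the human who signed off in the channel


def apply_notices(requested_fields, notices):
    """Rewrite the project agent's field list using the posted renames."""
    renames = {notice.old_name: notice.new_name for notice in notices}
    return [renames.get(field, field) for field in requested_fields]


# Hypothetical channel history, oldest message first.
channel = [
    RenameNotice("cust_nm", "customer_name", approved_by="reviewer"),
    RenameNotice("EMAIL", "email", approved_by="reviewer"),
]
fields = apply_notices(["cust_nm", "EMAIL", "phone"], channel)
```

Fields with no notice pass through unchanged, so the project agent keeps working against names that have not been cleaned up yet.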

Build a data lake. Information scattered across 20 systems? Welcome to reality. The AI age demands a shift from siloed product data to a comprehensive data lake. Every drop of data in that container gives your AI agents more references, connections, and context. It feels overwhelming to start. That's exactly where your data cleanup AI earns its keep.

Give yourself permission to iterate. This is the hardest one for leaders. Let AI take a first pass solo. If you've containerized properly and kept your source data untouched, the worst case is you reset and run again. The biggest friction in the AI age isn't technical. It's the feeling of control slipping away. But trying to maintain human oversight on every micro-decision means you might as well do it by hand. Let the AI run. Review. Adjust. Repeat.

From the Field

A best practice that saved me real money: ask your data cleanup agent to identify opportunities for mechanical or deterministic sorting before it starts burning tokens on AI processing.

I did this with my personal nemesis. Email. Over 300 emails per hour across multiple legacy accounts. No human was handling that volume. So during the data cleanup process, part of my prompt asked the agent to find mechanical paths forward to reduce token cost. It took a snapshot sample, identified the most common offenders, and built an automatic sorting script that runs without AI at all. Those scripts are self-running now. The data cleanup became an ongoing process instead of a one-off project.
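The pattern is simple enough to sketch. The code below is an assumed reconstruction of the idea, not the actual script the agent wrote: count the most common sender domains in a sample, turn the worst offenders into plain rules, and send only the leftovers to AI. The domains and folder names are made up.

```python
from collections import Counter


def top_sender_domains(emails, n=1):
    """Find the most frequent sender domains in a snapshot sample."""
    counts = Counter(email["sender"].split("@")[-1].lower() for email in emails)
    return [domain for domain, _ in counts.most_common(n)]


# Hypothetical rules a human approved after reviewing the top offenders.
DOMAIN_RULES = {"shop.example": "promotions", "vendor.example": "invoices"}


def route(email, rules):
    """Return a folder from the deterministic rules, or defer to AI."""
    domain = email["sender"].split("@")[-1].lower()
    return rules.get(domain, "needs_ai")


sample = [
    {"sender": "news@shop.example", "subject": "Spring sale"},
    {"sender": "deals@shop.example", "subject": "Last chance"},
    {"sender": "billing@vendor.example", "subject": "Invoice 1042"},
    {"sender": "pat@partner.example", "subject": "Question about the contract"},
]
offenders = top_sender_domains(sample)
folders = [route(email, DOMAIN_RULES) for email in sample]
```

In this sample only one of the four messages reaches the `needs_ai` path. The other three never cost a token.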

Years ago, I worked on a massive data lake project with a leading US life insurance company. Over 35 federated backend systems. No common naming conventions. Most of them had never released data externally. I did the hard version of this work. Mapping thousands of fields by hand. Tracking down ancient documentation. Wrestling with deprecated datasets. Remapping to the best of my ability. That project won a major industry award, and I'm proud of the work. But I would never have a human do it again. Everything I did there can now be done by AI in a fraction of the time. Containerization, filtering, iteration. That's the formula.

Bottom Line

Your data is messy. That's not a blocker. It's a starting condition.

The companies winning with AI aren't the ones with perfect data. They're the ones who figured out how to clean and build at the same time.

Stop waiting. Start filtering.

Want to know where your organization actually stands? Take the MACH & AI Readiness Assessment to benchmark your data readiness, composable architecture maturity, and AI strategy against enterprise leaders.

Written by Kerrigan Baron (they/them), CEO & Founder

Your Data Is a Mess. Start Your AI Project Anyway. | Focal Point - Fidget Labs