My friend Christian Martinez had a mess.
Three Excel tabs. Billing transactions on one. GL accounts on another. Bank reconciliation on the third. Cross-referenced rows were incomplete and dates had three different formats. Duplicates existed and GL codes were missing.
He needed to clean it. Or have a complete nightmare on his hands.
So he did something super smart.
Instead of spending two days manually fixing things, he used AI to automate the entire process.
(this is why he teaches inside my AI Finance Club)
But – and this is really important – he never changed the raw data. He created a cleaned version beside it.
Most people don't do what Christian did.
They take that same mess and give it to AI and expect it to work.
But AI doesn't always push back. It doesn't say "hey, your dates are in two different formats, which one do you want?"
It just picks one. Then it gives you something that looks good, but is completely wrong underneath.
I see this all the time. A forecast that looked perfect but had double-counted revenue. Why? Because nobody flagged the duplicate rows before uploading. The AI treated every single row as fact.
And this is the thing that I want to tell you today.
You do not need perfect data to use AI.
But you need to understand what's broken first, because AI can take your existing problems and make them worse. In a way that looks really good on screen!
Your Data Has 3 Problems
You have messy data. There is no getting away from this.
I have never met anyone with perfect data.
57% of companies say data reliability is their biggest blocker against AI success (Informatica – CDO Insights 2026).
Plus, most companies' governance hasn't kept up with their speed of AI adoption.
So, you've bought the AI, trained people and run pilots. But you have not given your data the same attention.
As I've said before: "If you use AI with those tables that are made for humans, you are not going to get a good output."
And the reason is simple. Your data has three core problems, which AI can make worse if you are not careful.
Problem 1: Inconsistent formats
Your dates are in three different formats. Some columns report amounts in thousands, others in single units. One system says "United States", another says "US", a third says "USA."
When you're in the spreadsheet yourself, you notice this. And you fix it as you go.
AI might not notice. It just makes an assumption about which format is correct and moves on.
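To see how mixed your date column really is before any AI touches it, here is a minimal sketch. It assumes only three candidate formats (the `CANDIDATE_FORMATS` list is my assumption, so extend it for your own exports) and it counts how many values fit each format, fit more than one, or fit none.

```python
from collections import Counter
from datetime import datetime

# Candidate formats to test; this list is an assumption, extend it for your data.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%m/%d/%Y"]

def matching_formats(value: str) -> list[str]:
    """Return every candidate format that successfully parses the value."""
    matches = []
    for fmt in CANDIDATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            matches.append(fmt)
        except ValueError:
            pass
    return matches

def profile_dates(values: list[str]) -> Counter:
    """Count values by status: a single format, 'ambiguous', or 'unparsed'."""
    counts = Counter()
    for value in values:
        matches = matching_formats(value.strip())
        if not matches:
            counts["unparsed"] += 1
        elif len(matches) > 1:
            counts["ambiguous"] += 1  # e.g. 03/04/2025 fits both DD/MM and MM/DD
        else:
            counts[matches[0]] += 1
    return counts

sample = ["2025-01-31", "31/01/2025", "01/31/2025", "03/04/2025", "Jan 5"]
print(profile_dates(sample))
```

The "ambiguous" bucket is the one to watch. Those are the dates where a tool has to pick a format, and where you should decide instead.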
Problem 2: Missing data
What if GL codes are blank for 15% of your transactions?
You'd flag them and ask someone to fill them in.
AI might fill them in too, but it doesn't ask you first. It looks at the patterns in your data and makes its best guess. And those guesses are the dangerous part because they look like they could work.
Nothing tells you which ones are real and which ones aren't.
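One way to keep real values and gaps apart is to flag the blanks instead of filling them. This is a small sketch under assumptions: the field names (`gl_code`, `gl_status`) and the sample transactions are hypothetical, not from Christian's workbook.

```python
def flag_missing_gl(rows: list[dict]) -> list[dict]:
    """Return copies of the rows with a 'gl_status' column; the input is untouched."""
    flagged = []
    for row in rows:
        code = (row.get("gl_code") or "").strip()
        status = "provided" if code else "MISSING - needs owner review"
        flagged.append({**row, "gl_status": status})
    return flagged

# Hypothetical sample data for illustration only.
transactions = [
    {"id": "T1", "amount": 120.0, "gl_code": "4000"},
    {"id": "T2", "amount": 75.5, "gl_code": ""},
    {"id": "T3", "amount": 310.0, "gl_code": None},
]
flagged = flag_missing_gl(transactions)
missing_share = sum(r["gl_status"] != "provided" for r in flagged) / len(flagged)
print(f"{missing_share:.0%} of transactions have no GL code")
```

With a status column like this, anything a tool fills in later can be traced back to a row that was originally blank.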
Problem 3: Duplicates and dirty records
When you work with this data manually, duplicates are frustrating but you spot them eventually.
With AI, every row can get treated as truth. So your duplicates don't get filtered out, and your totals get inflated.
But nobody questions this until someone tries to reconcile back to the general ledger and the numbers don't match.
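To make the inflation concrete, here is a rough sketch that compares the total you get when every row counts with the total when each record counts once. The invoice data and the choice of key fields are assumptions for illustration. In your own data, the key is whatever uniquely identifies a transaction.

```python
from collections import Counter

def duplicate_impact(rows, key_fields, amount_field):
    """Compare the naive total with the total after counting each key once."""
    seen = Counter()
    naive_total = 0.0
    deduped_total = 0.0
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        naive_total += row[amount_field]
        if seen[key] == 0:
            deduped_total += row[amount_field]  # first time we see this record
        seen[key] += 1
    duplicate_keys = [k for k, n in seen.items() if n > 1]
    return naive_total, deduped_total, duplicate_keys

# Hypothetical sample data: INV-1 was exported twice.
invoices = [
    {"invoice": "INV-1", "date": "2025-01-05", "amount": 1000.0},
    {"invoice": "INV-2", "date": "2025-01-09", "amount": 250.0},
    {"invoice": "INV-1", "date": "2025-01-05", "amount": 1000.0},
]
naive, deduped, dupes = duplicate_impact(invoices, ["invoice", "date"], "amount")
print(f"All rows: {naive:,.2f} | each record once: {deduped:,.2f} | duplicates: {dupes}")
```

The gap between the two totals is the amount a forecast would silently double-count.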
So. These three problems are nothing new. You've been dealing with them your entire career.
The difference now is that AI sits between you and the data. And instead of showing you the errors, it hides behind a clean-looking output.
But… You already know how to fix this!
You reconcile monthly. You audit internal controls. You understand IFRS and SOX. And those same skills apply to data and AI.
You just need a method for cleaning and auditing your data with AI.
Red, Amber or Green?
Before you even begin thinking about cleaning your data, you need to understand it.
So, pick your top three data sources. Then score each one across these five dimensions using RAG (Red/Amber/Green):
Completeness. Are all required rows and columns populated? Green = everything there. Red = more than 10% missing.
Consistency. Are all formats standardized? One dataset has dates as DD/MM/YYYY, another as MM/DD/YYYY. One has currency in thousands, another in ones. One codes country as "United States", another as "US". Green = all matching format. Red = multiple formats across the dataset.
Timeliness. How fresh is the data? Green = refreshed daily. Red = refreshed monthly or slower.
Accuracy. Spot-check 10 rows. Do they match your general ledger or bank statements? Green = 100% of spot checks match. Red = less than 80% match.
Accessibility. Can AI actually read it? Is it locked? Is it a PDF? Does it have hidden columns? Is it multiple nested tabs? Green = CSV or clean single-tab structure. Red = PDF, image, locked, or deeply nested.
You'll usually find Amber ratings on Completeness and Red on Consistency.
And this is exactly where you start.
Don't try to fix everything at once.
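If you want the scoring to be repeatable, the numeric thresholds above can be sketched in a few lines. The Green and Red cut-offs come from the list above. Treating everything in between as Amber, and reading "monthly or slower" as 30 or more days between refreshes, are my assumptions. Consistency and Accessibility are judgment calls, so they are left out.

```python
def rag_completeness(missing_share: float) -> str:
    """Green = nothing missing, Red = more than 10% missing, Amber = in between."""
    if missing_share == 0:
        return "Green"
    if missing_share > 0.10:
        return "Red"
    return "Amber"

def rag_accuracy(matches: int, checked: int = 10) -> str:
    """Green = every spot check matches, Red = fewer than 80% match."""
    share = matches / checked
    if share == 1.0:
        return "Green"
    if share < 0.80:
        return "Red"
    return "Amber"

def rag_timeliness(days_between_refreshes: int) -> str:
    """Green = refreshed daily, Red = monthly or slower (30+ days, an assumption)."""
    if days_between_refreshes <= 1:
        return "Green"
    if days_between_refreshes >= 30:
        return "Red"
    return "Amber"

# Hypothetical scores for one data source.
scorecard = {
    "Completeness": rag_completeness(0.15),
    "Accuracy": rag_accuracy(matches=9),
    "Timeliness": rag_timeliness(7),
}
print(scorecard)
```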
How to Clean Your Data Using AI
So, let me show you how to do what Christian did, step by step.
Step 1: Prepare your raw data.
Export your messiest dataset: the one with duplicates, formatting issues and missing values (the one that scored Amber or Red).
Have it ready in Excel or CSV.
Step 2: Use this prompt
Upload your data (make sure to use a secure AI tool and its reasoning mode) and ask:
2. [List: e.g., missing GL account codes for 15% of transactions]
– Identify all duplicates and flag them on a new tab (don't delete yet)
– Standardize all dates to YYYY-MM-DD
This will clean your data, but it will also give you an audit trail.
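For reference, this is roughly what those two instructions look like when written out by hand. It is a sketch, not the code an AI tool would actually produce. The field names and sample rows are hypothetical, and the source date format is passed in explicitly (`date_format`) so that nothing gets guessed.

```python
from datetime import datetime

def clean_rows(raw_rows, date_format, key_fields):
    """Build a cleaned copy plus a duplicate log; raw_rows is never modified."""
    cleaned, duplicate_log, seen = [], [], set()
    for line_no, row in enumerate(raw_rows, start=1):
        new_row = dict(row)  # copy, so the raw data stays exactly as exported
        parsed = datetime.strptime(row["date"], date_format)
        new_row["date"] = parsed.strftime("%Y-%m-%d")
        key = tuple(new_row[f] for f in key_fields)
        new_row["is_duplicate"] = key in seen
        if new_row["is_duplicate"]:
            # Flag, don't delete: the log plays the role of the "new tab".
            duplicate_log.append({"raw_line": line_no, **new_row})
        seen.add(key)
        cleaned.append(new_row)
    return cleaned, duplicate_log

# Hypothetical raw export with DD/MM/YYYY dates and one repeated invoice.
raw = [
    {"invoice": "INV-1", "date": "05/01/2025", "amount": 1000.0},
    {"invoice": "INV-2", "date": "09/01/2025", "amount": 250.0},
    {"invoice": "INV-1", "date": "05/01/2025", "amount": 1000.0},
]
cleaned, duplicate_log = clean_rows(raw, "%d/%m/%Y", ["invoice", "date", "amount"])
print(cleaned[0]["date"], len(duplicate_log))
```

Because every duplicate is logged with its original line number, each change can be traced back to the raw export.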
Step 3: Verify the output
Check four things:
- Input totals match output totals. Sum revenue on original vs cleaned. Should be identical.
- Line counts make sense. 4,200 became 3,950? Check your duplicate flags.
- Subtotals by dimension are stable. Revenue by country should match before and after. If Country X jumped from £1.2M to £3.2M, that's a red flag.
- Visual anomaly check. Bar chart of revenue by month. I call this "the big elephant in the room." Things that are broken show up visually before they show up in the numbers.
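The first three checks are easy to script. Here is a minimal sketch, assuming each row has an `amount` and a dimension such as `country` (both names are hypothetical). The sample data simulates a scaling slip in the cleaned copy. The chart check still needs your eyes, so it isn't covered.

```python
from collections import defaultdict

def subtotal(rows, dimension, amount_field="amount"):
    """Sum the amount field per value of the chosen dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[amount_field]
    return dict(totals)

def verify(original, cleaned, removed_duplicates, dimension, tolerance=0.005):
    """Run the totals, line-count and subtotal checks on original vs cleaned."""
    total_in = sum(r["amount"] for r in original)
    total_out = sum(r["amount"] for r in cleaned)
    before, after = subtotal(original, dimension), subtotal(cleaned, dimension)
    moved = sorted(
        k for k in set(before) | set(after)
        if abs(before.get(k, 0.0) - after.get(k, 0.0)) > tolerance
    )
    return {
        "totals_match": abs(total_in - total_out) <= tolerance,
        "line_count_explained": len(original) - len(cleaned) == removed_duplicates,
        "subtotals_moved": moved,  # an empty list means every subtotal is stable
    }

# Hypothetical data: one UK amount was scaled by 10 during cleaning.
original = [
    {"country": "UK", "amount": 1200.0},
    {"country": "FR", "amount": 800.0},
    {"country": "UK", "amount": 300.0},
]
broken = [
    {"country": "UK", "amount": 1200.0},
    {"country": "FR", "amount": 800.0},
    {"country": "UK", "amount": 3000.0},
]
print(verify(original, broken, removed_duplicates=0, dimension="country"))
```

One caveat: if the cleaning step standardizes the dimension's own labels (say, "US" and "United States" merged into one), subtotals will move for a legitimate reason, so compare them after mapping both sides to the same labels.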
Step 4: Build your governance register
Now that you have clean data, you need control. Christophe Atten in a recent workshop raised the accountability question:
"Who is accountable if AI produces wrong outputs?"
Without clear governance, you're at risk (after 7 years as an auditor at PwC, trust me when I say this).
So, make sure to create one simple Excel register:
Update it monthly.
This register answers: What AI is running? Who owns it? Is the data ready? What's the risk?
It's your control layer, and it also provides accountability.
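If you'd rather generate the register than type it, here is a small sketch. The column names are my assumptions, derived from the four questions the register answers plus a review date for the monthly update, and the sample entry is hypothetical.

```python
import csv
import io

# Column names are assumptions based on the four questions the register answers.
FIELDS = ["ai_use_case", "owner", "data_readiness_rag", "risk_level", "last_reviewed"]

# Hypothetical example entry.
register = [
    {"ai_use_case": "Billing data clean-up", "owner": "Finance systems lead",
     "data_readiness_rag": "Amber", "risk_level": "Medium",
     "last_reviewed": "2026-01-31"},
]

def register_to_csv(entries):
    """Serialise the register to CSV text, which Excel can open directly."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()

def unowned(entries):
    """List use cases with no named owner: the accountability gap."""
    return [e["ai_use_case"] for e in entries if not e["owner"].strip()]

print(register_to_csv(register))
print("Use cases without an owner:", unowned(register))
```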
Step 5: Run an AI self-audit, then manually verify
Don't ask AI to fix things to begin with. Ask it to audit and flag.
Upload your cleaned data and use this prompt:
Take every flag and reproduce it in Excel using formulas. Build a reconciliation tab:
- Did input totals match output totals? Yes/No
- Are line count changes explained? Yes/No
- Are subtotals by dimension stable? Yes/No
This reconciliation tab becomes your audit trail. When someone asks "Why did the forecast number change?" you can show them.
One last tip: if you're working with a lot of data that needs several different kinds of cleaning, I find that chunking it up works better. Remember, AI can only work with so much data at once, so splitting it up will produce better results.
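Chunking is simple to sketch. The chunk size below is arbitrary, and the right size depends on the tool you use. The important part is the control total at the end, which confirms that no rows were lost in the split.

```python
def chunk_rows(rows, chunk_size):
    """Yield consecutive slices of at most chunk_size rows."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]

# Hypothetical dataset of 10 rows, split into chunks of 4.
rows = [{"id": i, "amount": float(i)} for i in range(1, 11)]
chunks = list(chunk_rows(rows, chunk_size=4))

# Control totals: the chunks together must still add up to the full dataset.
rows_in_chunks = sum(len(c) for c in chunks)
amount_in_chunks = sum(r["amount"] for c in chunks for r in c)
print(len(chunks), rows_in_chunks, amount_in_chunks)
```

One thing to watch: a duplicate pair that lands in two different chunks won't be visible inside either one, so run duplicate detection on the full dataset before splitting.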
The One Thing to Remember
If you skip the method and just throw your data at AI, you already know what happens.
You can end up with more problems than you started with.
So, use AI to help you fix your data gaps. But do it the way Christian did. Keep the raw data untouched, build an audit trail and put governance around it.
Do this right, and every time you use AI, instead of getting worse, your data gets cleaner, your outputs get better, and your team trusts the process more.
The sooner you do this, the sooner your data will improve.
So make sure you start now.
Best,
Your AI Finance Expert,
– Nicolas
P.S. – Are you struggling with data right now? Hit reply and tell me. I read every reply.
P.P.S. – If you'd like more tips like these, here are 9 Power Moves to Make Your Finance Work 7x More Efficient with ChatGPT.
*Datarails Study – The CFO’s Office 2.0: The 2026 AI Transformation facing Finance Teams
