      Your data has 3 problems and AI is making all of them worse (fix inside)

      My friend Christian Martinez had a mess.

      Three Excel tabs. Billing transactions on one. GL accounts on another. Bank reconciliation on the third. Cross-referenced rows were incomplete and dates had three different formats. Duplicates existed and GL codes were missing.

      He needed to clean it. Or have a complete nightmare on his hands.

      So he did something super smart.

      Instead of spending two days manually fixing things, he used AI to automate the entire process.

      (this is why he teaches inside my AI Finance Club)

      But – and this is really important – he never changed the raw data. He created a cleaned version beside it.

      Most people don't do what Christian did.

      They take that same mess and give it to AI and expect it to work.

      But AI doesn't always push back. It doesn't say "hey, your dates are in two different formats, which one do you want?"

      It just picks one. Then it gives you something that looks good, but is completely wrong underneath.

      I see this all the time. A forecast that looked perfect but had double-counted revenue. Why? Because nobody flagged the duplicate rows before uploading. The AI treated every single row as fact.

      And this is the thing that I want to tell you today.

      You do not need perfect data to use AI.

      But you need to understand what's broken first, because AI can take your existing problems and make them worse. In a way that looks really good on screen!


      Your Data Has 3 Problems

      You have messy data. There is no getting away from this.

      I have never met anyone with perfect data.

      57% of companies say data reliability is their biggest blocker to AI success (Informatica – CDO Insights 2026).

      Plus, most companies' governance hasn't kept up with their speed of AI adoption.

      So, you've bought the AI, trained people and run pilots. But you have not given your data the same attention.

      As I've said before: "If you use AI with those tables that are made for humans, you are not going to get a good output."

      And the reason is simple. Your data has three core problems, which AI can make worse if you are not careful.


      Problem 1: Inconsistent formats

      Your dates are in three different formats. Some columns report amounts in thousands, others in single units. One system says "United States", another says "US", a third says "USA."

      When you're in the spreadsheet yourself, you notice this. And you fix it as you go.

      AI might not notice. It just makes an assumption about which format is correct and moves on.
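
      To see why a silent assumption is risky, here is a minimal Python sketch. The sample value and the helper name is_ambiguous are made up for illustration. It shows one piece of text turning into two different dates, depending on which format gets assumed.

```python
from datetime import datetime

raw = "03/04/2025"  # hypothetical value from a billing export

# The same text is a valid date under both conventions.
as_day_first = datetime.strptime(raw, "%d/%m/%Y").date()
as_month_first = datetime.strptime(raw, "%m/%d/%Y").date()

print(as_day_first.isoformat())    # 2025-04-03
print(as_month_first.isoformat())  # 2025-03-04


def is_ambiguous(text: str) -> bool:
    """True when the text parses as both DD/MM/YYYY and MM/DD/YYYY with different results."""
    results = set()
    for fmt in ("%d/%m/%Y", "%m/%d/%Y"):
        try:
            results.add(datetime.strptime(text, fmt).date())
        except ValueError:
            pass  # not a valid date under this format
    return len(results) > 1


print(is_ambiguous("03/04/2025"))  # True
print(is_ambiguous("25/04/2025"))  # False
```

      Any row where a check like this returns True is one you want to decide on yourself, rather than letting a tool pick for you.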

      Problem 2: Missing data

      What if GL codes are blank for 15% of your transactions?

      You'd flag them and ask someone to fill them in.

      AI might fill them in too, but it doesn't ask you first. It looks at the patterns in your data and makes its best guess. And those guesses are the dangerous part because they look like they could work.

      Nothing tells you which ones are real and which ones aren't.
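
      One way around this is to record where every value came from. The sketch below is only an illustration: the rows, the mapping, and the gl_source column are all hypothetical. It fills a blank GL code only when a reference mapping has an exact match, and labels each row as original, inferred or missing.

```python
# Hypothetical rows; an empty gl_code means the code is missing.
transactions = [
    {"id": "T1", "description": "AWS hosting", "gl_code": "6100"},
    {"id": "T2", "description": "AWS hosting", "gl_code": ""},
    {"id": "T3", "description": "Team offsite", "gl_code": ""},
]

# Hypothetical reference mapping from description to GL code.
gl_mapping = {"AWS hosting": "6100"}


def fill_gl_codes(rows, mapping):
    """Return copies of the rows with a gl_source column: original, inferred or missing."""
    out = []
    for row in rows:
        new = dict(row)  # copy, so the raw rows are never modified
        if new["gl_code"]:
            new["gl_source"] = "original"
        elif new["description"] in mapping:
            new["gl_code"] = mapping[new["description"]]
            new["gl_source"] = "inferred"
        else:
            new["gl_source"] = "missing"
        out.append(new)
    return out


filled = fill_gl_codes(transactions, gl_mapping)
for row in filled:
    print(row["id"], row["gl_code"] or "-", row["gl_source"])
```

      With a column like gl_source in place, you can filter for the inferred rows and review only those.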

      Problem 3: Duplicates and dirty records

      When you work with this data manually, duplicates are frustrating but you spot them eventually.

      With AI, every row can get treated as truth. So your duplicates don't cancel out and things get inflated.

      But nobody questions this until someone tries to reconcile back to the general ledger and the numbers don't match.
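
      Here is a small Python sketch of that inflation. The invoice rows are invented, and I am assuming invoice_id is the key that identifies a duplicate. It flags repeats instead of deleting them, then compares the two totals.

```python
# Hypothetical billing rows; INV-002 was exported twice.
rows = [
    {"invoice_id": "INV-001", "revenue": 1000},
    {"invoice_id": "INV-002", "revenue": 2500},
    {"invoice_id": "INV-002", "revenue": 2500},
    {"invoice_id": "INV-003", "revenue": 400},
]

# Add a flag column rather than deleting rows, so every row stays visible.
seen = set()
for row in rows:
    row["is_duplicate"] = row["invoice_id"] in seen
    seen.add(row["invoice_id"])

naive_total = sum(r["revenue"] for r in rows)
deduped_total = sum(r["revenue"] for r in rows if not r["is_duplicate"])

print(naive_total)    # 6400
print(deduped_total)  # 3900
```

      The gap between the two numbers is exactly the revenue that would be double-counted if every row were treated as fact.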


      So. These three problems are nothing new. You've been dealing with them your entire career.

      The difference now is that AI sits between you and the data. And instead of showing you the errors, it hides them behind a clean-looking output.

      But… You already know how to fix this!

      You reconcile monthly. You audit internal controls. You understand IFRS and SOX. And those same skills apply to data and AI.

      You just need a method for cleaning and auditing your data with AI.


      Red, Amber or Green?

      Before you even begin thinking about cleaning your data, you need to understand it.

      So, pick your top three data sources. Then score each one across these five dimensions using RAG (Red/Amber/Green):

      Completeness. Are all required rows and columns populated? Green = everything there. Red = more than 10% missing.

      Consistency. Are all formats standardized? One dataset has dates as DD/MM/YYYY, another as MM/DD/YYYY. One has currency in thousands, another in ones. One codes country as "United States", another as "US". Green = all matching format. Red = multiple formats across the dataset.

      Timeliness. How fresh is the data? Green = refreshed daily. Red = refreshed monthly or slower.

      Accuracy. Spot-check 10 rows. Do they match your general ledger or bank statements? Green = 100% of spot checks match. Red = less than 80% match.

      Accessibility. Can AI actually read it? Is it locked? Is it a PDF? Does it have hidden columns? Is it multiple nested tabs? Green = CSV or clean single-tab structure. Red = PDF, image, locked, or deeply nested.
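
      If you want to score this consistently across sources, two of the dimensions are easy to turn into rules. This is a rough Python sketch with hypothetical function names. The Green and Red thresholds come from the definitions above. Treating everything in between as Amber is my assumption.

```python
def missing_share(values) -> float:
    """Share of values that are None or blank."""
    values = list(values)
    blanks = sum(1 for v in values if v is None or str(v).strip() == "")
    return blanks / len(values)


def rag_completeness(share_missing: float) -> str:
    """Green = nothing missing, Red = more than 10% missing, Amber = in between (assumed)."""
    if share_missing == 0:
        return "Green"
    if share_missing > 0.10:
        return "Red"
    return "Amber"


def rag_accuracy(matches: int, checked: int) -> str:
    """Green = every spot check matches, Red = under 80% match, Amber = in between (assumed)."""
    if matches == checked:
        return "Green"
    if matches / checked < 0.80:
        return "Red"
    return "Amber"


print(rag_completeness(0.15))  # Red
print(rag_accuracy(9, 10))     # Amber
```

      For example, a GL code column that is blank for 15% of transactions scores Red on Completeness, and 9 matching spot checks out of 10 scores Amber on Accuracy.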

      You'll usually find Amber ratings on Completeness and Red on Consistency.

      And this is exactly where you start.

      Don't try to fix everything at once.


      How to Clean Your Data Using AI

      So, let me show you how to do what Christian did, step by step.

      Step 1: Prepare your raw data.

      Export your messiest dataset: the one with duplicates, formatting issues and missing values (the one that scored Amber or Red).

      Have it ready in Excel or CSV.

      Step 2: Use this prompt

      Upload your data (make sure to use a secure AI tool and its reasoning mode) and ask:

      I have a financial dataset combining [describe: e.g., billing transactions, GL accounts, bank reconciliation]. The specific problems are:
      1. [List: e.g., duplicate rows based on invoice ID]
      2. [List: e.g., dates in three different formats: DD/MM/YYYY, MM/DD/YYYY, and text like 'Jan 15']
      3. [List: e.g., missing GL account codes for 15% of transactions]
      4. [List: e.g., data spread across three tabs that need to be consolidated]
      I need you to:

      – Identify all duplicates and flag them on a new tab (don't delete yet)
      – Standardize all dates to YYYY-MM-DD
      – Fill missing GL codes by matching transaction descriptions to this reference list [paste your GL mapping if you have one]
      – Consolidate multi-tab data into one clean output
      – Build everything using formulas, not static values
      – Create a summary tab showing record counts before and after, by key dimensions
      – Leave the original raw data completely untouched

      This will clean your data, but it will also give you an audit trail.

      Step 3: Verify the output

      Check four things:

      1. Input totals match output totals. Sum revenue on the original vs the cleaned version. They should be identical once flagged duplicates are accounted for.
      2. Line counts make sense. 4,200 became 3,950? Check your duplicate flags.
      3. Subtotals by dimension are stable. Revenue by country should match before and after. If Country X jumped from £1.2M to £3.2M, that's a red flag.
      4. Visual anomaly check. Bar chart of revenue by month. I call this "the big elephant in the room." Things that are broken show up visually before they show up in the numbers.
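
      The first three checks can also be scripted. The sketch below uses invented rows and a hypothetical verify function. It makes one assumption that goes beyond the list above: rows flagged as duplicates are kept in their own list, and the comparison is raw data against cleaned rows plus flagged rows. Amounts are whole numbers to avoid rounding noise.

```python
from collections import defaultdict


def subtotals(rows, key):
    """Sum revenue per value of `key` (for example, per country)."""
    out = defaultdict(int)
    for r in rows:
        out[r[key]] += r["revenue"]
    return dict(out)


def verify(raw, clean, flagged, key="country"):
    """Compare raw rows against cleaned rows plus the rows flagged as duplicates."""
    accounted = clean + flagged
    return {
        "totals_match": sum(r["revenue"] for r in raw) == sum(r["revenue"] for r in accounted),
        "line_counts_explained": len(raw) == len(accounted),
        "subtotals_stable": subtotals(raw, key) == subtotals(accounted, key),
    }


# Hypothetical data: INV-002 appears twice in the raw export.
raw = [
    {"invoice_id": "INV-001", "country": "UK", "revenue": 1200},
    {"invoice_id": "INV-002", "country": "UK", "revenue": 800},
    {"invoice_id": "INV-002", "country": "UK", "revenue": 800},
    {"invoice_id": "INV-003", "country": "FR", "revenue": 500},
]
clean = [raw[0], raw[1], raw[3]]
flagged = [raw[2]]

good = verify(raw, clean, flagged)
print(good)

# A cleaned version where one amount was silently changed fails two checks.
clean_bad = [raw[0], raw[1], {"invoice_id": "INV-003", "country": "FR", "revenue": 5000}]
bad = verify(raw, clean_bad, flagged)
print(bad)
```

      The fourth check stays manual. A chart is still the fastest way to spot something a rule did not anticipate.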

      Step 4: Build your governance register

      Now that you have clean data, you need control. Christophe Atten in a recent workshop raised the accountability question:

      "Who is accountable if AI produces wrong outputs?"

      Without clear governance, you're at risk (after 7 years as an auditor at PwC, trust me when I say this).

      So, make sure to create one simple Excel register, and update it monthly.

      This register answers: What AI is running? Who owns it? Is the data ready? What's the risk?

      It's your control layer, and it also provides accountability.
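
      If you would rather start the register from a script, here is one possible layout as a Python sketch. The column names and the sample row are hypothetical. I derived them from the four questions above and added a review date to support the monthly update.

```python
import csv
import io

# Hypothetical columns, one per question the register should answer.
COLUMNS = ["ai_use_case", "owner", "data_readiness", "risk", "last_reviewed"]

# Hypothetical sample entry.
register = [
    {
        "ai_use_case": "Billing data clean-up",
        "owner": "Financial Controller",
        "data_readiness": "Amber",
        "risk": "Medium",
        "last_reviewed": "2026-01-31",
    },
]


def write_register(rows, handle):
    """Write the register as CSV to any open text file or buffer."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_register(register, buffer)
print(buffer.getvalue())
```

      To get a file you can open in Excel, pass a file opened with open("register.csv", "w", newline="") instead of the buffer. The file name is just an example.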

      Step 5: Run an AI self-audit, then manually verify

      Don't ask AI to fix things to begin with. Ask it to audit and flag.

      Upload your cleaned data and use this prompt:

      Review this financial dataset. Identify any anomalies or gaps. For each issue you find, tell me:
      1. What you found (e.g., 'July 2025 shows zero revenue for Market Segment B')
      2. Why it's unusual (e.g., 'This segment typically averages £400k monthly')
      3. What might explain it (e.g., 'Reporting lag, data exclusion error, or business shutdown')
      4. Do NOT change anything. Just flag it.

      Take every flag and reproduce it in Excel using formulas. Build a reconciliation tab:

      • Did input totals match output totals? Yes/No
      • Are line count changes explained? Yes/No
      • Are subtotals by dimension stable? Yes/No

      This reconciliation tab becomes your audit trail. When someone asks "Why did the forecast number change?" you can show them.

      One last tip – if you're working with a lot of data that needs many different kinds of cleaning, I find that chunking it up works better. Remember, AI can still only work with so much data at once, so splitting it up will produce better results.
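
      A minimal Python sketch of chunking, assuming your rows are already loaded into a list. The chunk size of 1,000 is arbitrary, and the right size depends on the tool you use.

```python
def chunk_rows(rows, size):
    """Yield consecutive slices of at most `size` rows, in the original order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


rows = list(range(4200))  # stand-in for 4,200 exported rows
chunks = list(chunk_rows(rows, 1000))

print([len(c) for c in chunks])     # [1000, 1000, 1000, 1000, 200]
print(sum(len(c) for c in chunks))  # 4200
```

      Checking that the chunk sizes add back up to the original row count is a quick way to confirm nothing was dropped in the split.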


      The One Thing to Remember

      If you skip the method and just throw your data at AI, you already know what happens.

      You can end up with more problems than you started with.

      So, use AI to help you fix your data gaps. But do it the way Christian did. Keep the raw data untouched, build an audit trail and put governance around it.

      Do this right, and every time you use AI, instead of getting worse, your data gets cleaner, your outputs get better, and your team trusts the process more.

      The sooner you do this, the sooner your data will improve.

      So make sure you start now.

      Best,

      Your AI Finance Expert,

      – Nicolas

      P.S. – Are you struggling with data right now? Hit reply and tell me. I read every reply.

      P.P.S. – If you'd like more tips like these, here are 9 Power Moves to Make Your Finance Work 7x More Efficient with ChatGPT.

      *Datarails Study – The CFO’s Office 2.0: The 2026 AI Transformation facing Finance Teams
