I had a conversation recently with a CEO. They had rolled out an AI agent; the sales team had been using it for six months. I asked: "So what did it bring you?" A small pause. "Well… I feel like we're faster."
That's it.
That's the problem. Not that AI doesn't work — the AI was probably working perfectly fine. The problem is that after six months the CEO couldn't tell me what they got for their money. And if they can't tell me, next year they won't spend more on it. Even if they should.
The famous 80%
Gartner, McKinsey, MIT: they all say the same thing. 70-85% of AI projects fail to deliver "significant business impact". It sounds scary. Except the definition is very different from the one in your head.
"Failure" here means: leadership cannot demonstrate the return. Not that there was no return. Just that it isn't visible. And those are two very different things.
What isn't visible doesn't exist. That's the executive reality. It doesn't matter that your sales team saves 4 hours per person per week if no one wrote down how much time they spent on it before. It doesn't matter that lead conversion went from 12% to 15% if no one captured the baseline and now you can't prove it was the AI, not the redesigned landing page.
Why is this so hard to get right?
The ROI of an ERP system is simple: it saves 5 hours a day across 10 people, out comes the calculator, and it pays back in 16 months.
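To make the "calculator" part literal, here's a back-of-envelope sketch. Every figure in it (hourly cost, workdays, project cost) is a hypothetical assumption, and the 5 hours is read as a team-wide total:

```python
# Back-of-envelope payback for the ERP example above.
# All figures are hypothetical assumptions.
hours_saved_per_day = 5        # across the 10-person team
hourly_cost = 40               # fully loaded cost per hour, hypothetical
workdays_per_month = 21

monthly_saving = hours_saved_per_day * hourly_cost * workdays_per_month  # 4,200
project_cost = 67_000          # hypothetical ERP project cost

print(f"Payback: {project_cost / monthly_saving:.0f} months")  # Payback: 16 months
```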
AI doesn't work like that. AI generates value in layers. There's the obvious one: saved time. Then the harder one to measure: customers who don't churn because you reply faster. And then the almost impossible one: you started learning two years before your competitors, and they will never get those two years back.
Most companies measure only the first layer. Yet financially that's usually the smallest item.
The real question isn't "how much did it bring"
It's how do you know.
This is what I tell clients: if right now, before the rollout, you can't answer what you're going to measure in the next six months — then don't roll anything out. Because in six months you still won't be able to answer, you'll just have spent the money by then.
You don't need a PhD for this. Three questions are the minimum:
What are you doing badly / slowly / expensively right now? A specific process. Not "our operations" but "sending out a quote takes an average of 3 days".
How will you know it improved? A number. "From 3 days to 1 day within 90 days." Not "it'll be faster".
What will you do if it hasn't improved after 6 months? Almost no one asks themselves this question. Yet it's the most important one. If there's no pre-decided shutdown criterion, you'll never shut down; you'll just drag it out, keep spending, keep explaining. A minimal sketch of what these three answers look like written down follows right after this list.
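For illustration, here's one way to write those three answers down before rollout. The structure and every value in it are hypothetical, not a prescribed template; the point is that every field is filled in before you spend anything:

```python
# A minimal sketch of a pre-rollout measurement plan.
# All field names and values are hypothetical.
from dataclasses import dataclass
from datetime import date

@dataclass
class MeasurementPlan:
    process: str             # a specific process, not "our operations"
    baseline: float          # the number today, e.g. days to send a quote
    target: float            # the number you commit to
    deadline: date           # when the target must be met
    shutdown_criterion: str  # pre-decided: what happens if it's missed

plan = MeasurementPlan(
    process="sending out a quote",
    baseline=3.0,                # days, measured today
    target=1.0,                  # days, within 90 days
    deadline=date(2026, 3, 31),  # hypothetical date
    shutdown_criterion="still above 2 days after 6 months: stop, redeploy budget",
)
```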
Measurement isn't Big Brother
This is the part I hear misunderstood most often: "too bureaucratic", "I don't want to surveil my colleagues", "it takes time away from real work".
Yet capturing a baseline is a 30-minute conversation. "How many hours a week do you spend on the CRM?" — "About 6 or 7." Written down, done. Six months later you ask again. That's the basis of measurement. Not a dashboard, not token tracking, not AI confidence rates. One question and one number.
The complicated stuff (the financial value of churn prevention, monetizing NPS points, control-group A/B tests) comes later. If at all. At a 50-person company you can capture 80% of the value with that 30-minute conversation and one Excel cell.
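Here's roughly what that one Excel cell computes. The baseline comes from the conversation above; the follow-up answer, hourly cost, and working weeks are hypothetical assumptions:

```python
# The 30-minute baseline turned into money. Baseline is from the
# conversation above; everything else is a hypothetical assumption.
baseline_hours_per_week = 6.5   # "about 6 or 7"
current_hours_per_week = 1.0    # the hypothetical answer six months later
hourly_cost = 35                # fully loaded, hypothetical
working_weeks_per_year = 46     # hypothetical

annual_value_per_person = (
    (baseline_hours_per_week - current_hours_per_week)
    * hourly_cost * working_weeks_per_year
)
print(f"~{annual_value_per_person:,.0f} per person per year")  # ~8,855
```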
Executive communication
This is where I see the most damage: a well-measured AI project that gets sold badly.
The board doesn't want to see 12 KPIs on a PowerPoint slide. The board wants to hear "it returned 70% ROI, the three main contributors are this and this, next year we're going to 110%". Three sentences. Plus a human story, because they remember those. "Anna in sales used to spend 6 hours a week on admin, now it's 1." They'll quote that a year from now, while the financial monetization of NPS points won't get mentioned.
And if someone in leadership asks — and someone will — that's when the detailed data comes out. Behind it sits the dashboard, the baseline, the calculation. But you don't lead with that.
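And the headline number itself is one line of arithmetic: ROI is just (value minus cost) divided by cost. A back-of-envelope with hypothetical figures:

```python
# Hypothetical figures, only to show what "70% ROI" means in that sentence.
annual_value = 340_000  # measured value delivered, hypothetical
annual_cost = 200_000   # licences + rollout + maintenance, hypothetical

roi = (annual_value - annual_cost) / annual_cost
print(f"ROI: {roi:.0%}")  # ROI: 70%
```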
What not to measure
The first three months. The AI is learning, the users are learning, the data is getting cleaned up. Month one's ROI is often negative, and that's completely fine. Anyone who decides after month one is making a mistake. Anyone who still hasn't decided after a whole year is making one too.
Vanity metrics. "We generated 10,000 AI interactions this month" — nice. Business value? No one knows. That's not ROI, that's just traffic.
And don't measure strategic investments by the same yardstick. If you're building data infrastructure so that two years from now you have something to put AI on, don't demand direct ROI from it now. It needs different KPIs: data quality, coverage, usage. The financial return comes in the next round.
The cost of not doing it
And finally a thought that's almost never said out loud. The ROI debate usually focuses on what you gain. But what you lose by sitting it out matters just as much.
If your competitor rolls it out and you don't, that will show up in market position 1-2 years from now. By then it's too late to measure. The unmeasurable but real risk is sometimes a stronger argument than the quantified upside.
If you want to go deeper: we worked out the detailed methodology — the 4-layer ROI model, the full formula, the 12 KPIs, the 7-step rollout process and a worked example for a 50-person company — in our knowledge base piece on Measuring AI ROI.