
Counting hours, not tokens.

Field note | 5 min read | Mustafa Mujahid

The wrong way to make a budget case for AI is to lead with model benchmarks. The right way is to lead with the line on the P&L the work touches.

Benchmarks are interesting to engineers. The CFO doesn't have a budget category for "MMLU score." They have one for headcount and one for vendor spend. If the AI workflow doesn't show up against either of those, it doesn't get a second year.

The three numbers that travel

We frame every Implement engagement against three columns. They're not always all positive. Sometimes the answer is that the workflow saves hours but adds vendor cost, and that trade is what's being negotiated.

  • Hours reclaimed. Operator hours that used to be spent on the workflow and are now spent elsewhere. This is the number that lets a department lead reassign capacity instead of asking for headcount.
  • Errors avoided. Mistakes the workflow used to surface (missed SLAs, mis-routed cases, billing errors) that the AI is now catching or preventing. Easy to undercount; very visible when it goes wrong.
  • Capacity unlocked. Work the team couldn't do before because they were full doing the manual version. Backlog reduction, faster response times, new client segments served.
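
One way to put the three columns side by side is a small calculation like the one below. This is a minimal sketch, not a prescribed model: the class name, field names, and every figure are hypothetical, and pricing hours and errors in currency is an assumption made here for illustration. Capacity unlocked is left as a qualitative list because it often resists a single price.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowCase:
    """Three-column summary for one workflow. All fields are illustrative."""

    hours_reclaimed_per_month: float
    loaded_hourly_rate: float           # assumed fully loaded cost of one operator hour
    errors_avoided_per_month: float
    cost_per_error: float               # assumed average cost of one mistake
    added_vendor_cost_per_month: float  # new vendor spend the workflow introduces
    capacity_unlocked: list[str] = field(default_factory=list)  # kept qualitative

    def labor_value(self) -> float:
        # Value of operator hours freed up for other work.
        return self.hours_reclaimed_per_month * self.loaded_hourly_rate

    def error_value(self) -> float:
        # Value of mistakes caught or prevented.
        return self.errors_avoided_per_month * self.cost_per_error

    def net_monthly_impact(self) -> float:
        # Can be negative: hours saved may not cover the added vendor cost.
        return self.labor_value() + self.error_value() - self.added_vendor_cost_per_month


# Hypothetical numbers, chosen only to show the arithmetic.
case = WorkflowCase(
    hours_reclaimed_per_month=120,
    loaded_hourly_rate=50.0,
    errors_avoided_per_month=10,
    cost_per_error=200.0,
    added_vendor_cost_per_month=3000.0,
    capacity_unlocked=["backlog reduction"],
)

print(f"Labor value:  {case.labor_value():.2f}")
print(f"Error value:  {case.error_value():.2f}")
print(f"Net impact:   {case.net_monthly_impact():.2f}")
```

Keeping vendor cost as its own subtracted term makes the trade explicit: a workflow that saves hours but adds vendor spend shows up as two separate lines, not one blended figure.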

What we don't lead with

Token spend, model accuracy, latency, vendor logos. Those are all real and important, but they live in the technical addendum, not the executive summary. If the executive summary leads with them, you've already lost the argument.

The framing exists because most AI rollouts that get killed in year two don't get killed for technical reasons. They get killed because nobody could explain to the new CFO why the line item exists. The work to translate from "tokens per request" to "hours per case" is the work that makes the line item survive a budget review.
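
The translation from "tokens per request" to "hours per case" is mostly unit conversion. The sketch below assumes a flat per-1,000-token price and a measured before-and-after handling time; the function names, parameters, and all input values are hypothetical and stand in for whatever your own vendor invoice and time study show.

```python
def vendor_cost_per_case(
    tokens_per_request: float,
    requests_per_case: float,
    price_per_1k_tokens: float,
) -> float:
    """Vendor spend attributable to one case, assuming flat per-token pricing."""
    return tokens_per_request * requests_per_case * price_per_1k_tokens / 1000


def hours_saved_per_case(manual_minutes: float, assisted_minutes: float) -> float:
    """Operator hours reclaimed on one case, from measured handling times."""
    return (manual_minutes - assisted_minutes) / 60


# Hypothetical inputs, for illustration only.
cost = vendor_cost_per_case(
    tokens_per_request=2000,
    requests_per_case=3,
    price_per_1k_tokens=0.01,
)
hours = hours_saved_per_case(manual_minutes=30, assisted_minutes=12)
cases_per_month = 1000

print(f"Vendor cost per case:  {cost:.2f}")
print(f"Hours saved per case:  {hours:.2f}")
print(f"Monthly vendor cost:   {cost * cases_per_month:.2f}")
print(f"Monthly hours saved:   {hours * cases_per_month:.2f}")
```

The two monthly figures map onto the two categories a CFO already tracks: vendor spend and headcount capacity.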

The CFO doesn't need to understand the model. They need to understand what the model did to a number they already track.

If you can't draw a line from the AI workflow to a number on the P&L, the audit isn't done. We'd rather find that out in week two than in budget season.
