
Managing Director AI Strategy, Americas
“Human in the loop” has become a reflex. Slap a checkpoint on every AI workflow, call it responsible, move on. But a checkpoint without a reason isn’t governance. It’s a bottleneck cosplaying as caution.
The agentic era demands a sharper question. Not “should a human review this?” but “does a human reviewing this actually change the outcome?” Be honest. In workflow after workflow, the answer is no. And that honesty is where competitive advantage starts.
The numbers have moved past debate. Gartner predicts 40% of enterprise applications will integrate task-specific AI agents by the end of 2026, up from under 5% in 2025, and warns that CIOs have a three-to-six-month window to define their agentic AI strategy or risk falling behind. Microsoft telemetry shows 80% of Fortune 500 companies are already building AI agents with low-code tools. OpenClaw, the open-source agentic AI framework, became the most-starred software project on GitHub in under four months. In China, Tencent ran 40-day tours across 17 cities helping people install OpenClaw on their laptops. Not engineers. Architects. Product designers. Retirees.
This is not a future capability. It is a current competitive gap. And it’s widening every week you spend debating whether your organization is “ready.”
Meanwhile, the default enterprise response is to layer human approval over every autonomous action. Review the draft. Approve the summary. Sign off on the CRM entry. That instinct comes from a reasonable place. It also reinstates the exact overhead that agentic AI was supposed to remove.
We’ve written before about how enterprise structures were designed around human constraints: specialization requires coordination, coordination requires meetings, meetings require prep, and roughly 30% to 50% of work inside a large organization exists just to keep humans aligned. Agentic AI collapses that overhead. This post is about what happens next.
Because the instinct, once you see that collapse, is to rebuild the overhead in a new form. A human checkpoint at every stage of an agent’s work doesn’t add quality. It preserves the friction you paid to eliminate. And that friction has a real cost: every hour your senior salesperson spends reviewing AI-drafted CRM entries is an hour they’re not closing revenue. Your product team spends a week approving outputs that are 98% accurate. Your competitor uses that week to ship.
An agent can reason, plan, select tools, and iterate. What it cannot do is decide what the organization values. Your people carry decades of context about what actually serves the customer, the brand, the long-term strategy. That judgment is irreplaceable. Not because AI can’t get better at it, but because accountability for “what we stand for” should never be automated.
Klarna learned this the hard way. The company replaced 700 customer service agents with AI, claimed victory, then watched satisfaction scores collapse. CEO Sebastian Siemiatkowski admitted they “went too far.” They’re now rehiring humans for the moments that require empathy and nuance. The AI handles volume. Humans define what quality means.
Agentic AI collapses the capacity excuse. When spinning up another agent is nearly free, backlogs stop being about prioritization. They become about ambition. “What can we get done?” becomes “what should we pursue?” That’s a strategic call. It requires vision, risk appetite, and business context no model owns yet.
The Commonwealth Bank of Australia replaced 45 call center roles with AI voice bots. Customers flooded the bank with unresolved issues. Public backlash followed. Within days, the bank was rehiring staff. Financial transactions, credential access, customer-facing communications with legal exposure: these are the moments that justify a human in the loop. Place your checkpoints here. Deliberately.
There’s a necessary counterweight to this argument. Toyota, the company that pioneered factory automation, deliberately keeps automation below 8% on certain key processes. Not because the robots can’t handle it. Because removing humans entirely erodes the expertise you need when something goes wrong. The same logic applies to knowledge work. If your team never touches the RFP process because an agent handles it end to end, they lose the judgment to evaluate whether the agent’s output is actually good. Stepping out doesn’t mean disappearing. It means concentrating human involvement where it compounds expertise, not where it just confirms what a machine already got right.
Your best salesperson closes a 60-minute client call. Then spends 30 minutes on CRM updates, follow-up drafts, and research notes. An agent handles all of it in seconds.
Now add a human reviewer who approves each output before it’s logged.
That reviewer changes nothing. Approval rates on these tasks run above 95% in every sales org we’ve worked with. The reviewer isn’t catching errors. They’re adding latency. And the cost isn’t just their time. It’s the 30 minutes of selling capacity you gave back and then immediately confiscated.
Do the math. You’re paying for the AI. You’re paying for the person watching the AI. And you’re getting the same output you’d get without the watcher.
That’s not a productivity gain. That’s the status quo with a bigger invoice.
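One way to put a number on that invoice is to ask how many reviewer minutes you spend for each output the reviewer actually changes. The sketch below is a minimal illustration in Python. The function name, the 100-output week, and the three minutes per review are assumptions made for the example, not measured figures; only the 95% approval rate comes from the text above.

```python
def review_minutes_per_correction(reviews: int,
                                  approved_unchanged: int,
                                  minutes_per_review: float) -> float:
    """Reviewer minutes spent for each output the reviewer actually changed."""
    corrected = reviews - approved_unchanged
    if corrected <= 0:
        # The reviewer never changed anything: the cost per correction is unbounded.
        return float("inf")
    return reviews * minutes_per_review / corrected


# Hypothetical week: 100 reviewed outputs, 95 approved as-is, 3 minutes per review.
cost = review_minutes_per_correction(100, 95, 3.0)
print(f"{cost:.0f} reviewer minutes per correction")
```

Under those assumed inputs, every correction costs an hour of review time. Plug in your own counts before drawing conclusions.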
A deliberate checkpoint means you’ve mapped the workflow, identified the irreversible moments, and placed a human decision point exactly there. You’ve told the agent when to escalate and what to do when it can’t reach a human. That’s governance.
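A deliberate checkpoint can be written down as a routing rule. The Python sketch below is one possible shape, not a prescribed design: the `Action` and `Route` names are hypothetical, and "hold the action and queue it" is an assumed fallback for when no human is reachable. Your own fallback may differ.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO = "run without review"
    HUMAN = "wait for a human decision"
    HOLD = "hold the action and queue it for a human"


@dataclass(frozen=True)
class Action:
    name: str
    irreversible: bool  # e.g. moves money, grants access, carries legal exposure


def route(action: Action, human_available: bool) -> Route:
    """Send only irreversible actions to a human; define what happens if none is reachable."""
    if not action.irreversible:
        return Route.AUTO
    # Assumed fallback: never execute an irreversible action unreviewed.
    return Route.HUMAN if human_available else Route.HOLD


crm_note = Action("log CRM note", irreversible=False)
refund = Action("issue customer refund", irreversible=True)
print(route(crm_note, human_available=False).value)
print(route(refund, human_available=False).value)
```

The point of the sketch is that the escalation rule and the unreachable-human rule live in one place, where they can be reviewed and changed each quarter.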
A reflexive checkpoint means you stamped “human approval required” on every step because your legal team said no to everything in 2023 and nobody revisited the policy. That’s a pilot that will never scale.
Revisit your guardrails every quarter. What felt risky six months ago may be fully solved today. Tightening and loosening controls based on evidence beats staying frozen by a policy written before agentic AI existed.
Take your top three AI-enabled workflows. For each human checkpoint in the process, ask three questions:
First, how often does the reviewer approve the output unchanged? If the approval rate is 95% or higher, the checkpoint is confirmation theater. Remove it or convert it to a spot-check on a sample.
Second, what happens when the agent gets it wrong? If the answer is “someone fixes it in two minutes,” you don’t need a gatekeeper. If the answer is “we lose a client or expose sensitive data,” keep the human. The threshold is reversibility, not perfection.
Third, is the review still teaching the system anything? Early in deployment, human review generates feedback that improves the system. After the system stabilizes, that same review becomes pure drag. Set a sunset date for every checkpoint when you create it.
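The audit can be captured as a small decision rule. This is a minimal Python sketch under stated assumptions: the `Checkpoint` fields and the `recommend` function are hypothetical names, and checking reversibility before approval rate is one reading of "the threshold is reversibility, not perfection." The 95% cutoff is the one given above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Checkpoint:
    name: str
    approval_rate: float     # share of outputs approved unchanged, 0.0 to 1.0
    reversible: bool         # can a miss be fixed in minutes?
    sunset: Optional[date]   # date by which the checkpoint must be re-justified


def recommend(cp: Checkpoint, today: date) -> str:
    """Apply the audit: reversibility first, then approval rate, then sunset date."""
    if not cp.reversible:
        return "keep the human"
    if cp.approval_rate >= 0.95:
        return "remove or convert to spot-check"
    if cp.sunset is None:
        return "keep for now and set a sunset date"
    if today >= cp.sunset:
        return "past sunset: re-justify or remove"
    return "keep until sunset"


today = date(2026, 1, 15)
crm = Checkpoint("CRM entry approval", 0.97, reversible=True, sunset=None)
wire = Checkpoint("Wire transfer sign-off", 0.99, reversible=False, sunset=None)
print(recommend(crm, today))
print(recommend(wire, today))
```

Note that the wire transfer keeps its human even at a 99% approval rate, because a miss there cannot be undone.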
Three workflows. Three questions each. You’ll find at least one checkpoint that’s adding time, not judgment. Remove it. Measure what happens. If three workflows showed you where the waste is, redesigning how work gets done across your organization is where the real leverage sits.
If you’re figuring out where agentic AI fits in your enterprise, we should talk.
BOI helps companies map their highest-value agentic opportunities, build with the right foundations from day one, and bring their people through the transition.
Jon leads our AI strategy practice in the US with a clear mission: to help leading businesses turn AI ambition into real, scalable capability and get to outcomes they can hang their hat on. With a track record of shaping and driving AI transformation at companies like Cigna and Northwestern Mutual, Jon knows what it takes to scale AI.