Weekly Deep Dive

You can run a fleet of AI agents now. Governing them is the hard part

2026-06-08 · Unfair Advantage Editorial

A year ago, a small business ran one chatbot. Now a five-person shop might have an agent answering customer chat, another reconciling invoices, a third drafting marketing, and a fourth writing code. The New York Times spent June profiling owners managing "whole armies" of these things. The capability is real and cheap. The discipline to run them safely is not.

Here's the problem nobody budgets for. A chatbot one person uses has one risk profile. An agent wired into your files, your payment system, and your email has a completely different one — and you usually find out the hard way. One company reportedly ran up a $500 million Claude bill in a single month because nobody set a usage cap, a story Axios broke and Tom's Hardware amplified. That's an extreme case, but the mechanism is ordinary: agents loop, agents retry, agents call other agents, and the meter never sleeps. Microsoft and Uber have both reportedly pulled back Claude usage after costs got hard to justify.
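To make that mechanism concrete, here is a rough back-of-the-envelope sketch. The function name and every number in it are invented for illustration, and it assumes the worst case: every call is attempted the maximum number of times, and every attempt hands work to the same number of sub-agents.

```python
def worst_case_calls(fanout: int, depth: int, attempts: int) -> int:
    """Count billable calls when each task is attempted `attempts` times
    and every attempt spawns `fanout` sub-agent tasks, `depth` levels deep.
    Illustrative model only; real agent frameworks behave differently."""
    total = 0
    tasks = 1  # level 0: the single task you asked for
    for _ in range(depth + 1):
        calls = tasks * attempts   # every task at this level, with retries
        total += calls
        tasks = calls * fanout     # each attempt delegates to sub-agents
    return total


# One request, no retries, each agent delegates to 3 others, 2 levels deep.
print(worst_case_calls(fanout=3, depth=2, attempts=1))  # 13

# Same chain, but every call is tried up to 3 times.
print(worst_case_calls(fanout=3, depth=2, attempts=3))  # 273
```

Under these assumptions, allowing three attempts turns 13 calls into 273 for a single request, which is why a cap has to sit outside the agents themselves.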

Cost is only the loud failure. The quiet ones are worse. An agent with access to your customer database can leak it. An agent that can send email can be tricked into sending the wrong one. A 2026 Gravitee survey found only 24% of organizations have full visibility into which of their own agents are even talking to each other. You can't govern what you can't see — and most small teams can't see anything.

The fix isn't a better model. It's plumbing. Microsoft spent its entire Build 2026 security keynote on exactly this: shadow AI, tool sprawl, and the gap between how fast developers ship agents and how slowly anyone gains oversight. Their answer for enterprises is a wall of new tooling. Yours can be a one-page set of rules, but you do need the rules.

Think of governance as four dials, not a binder. First, a spending cap on every agent — a hard dollar ceiling per day, set in the provider's billing console, that shuts the agent off rather than letting it run wild overnight. Second, least access — each agent gets the minimum it needs and nothing more. The invoice bot doesn't need your customer list; the support bot doesn't need your bank API. Third, a human gate on anything irreversible — sending money, deleting records, emailing a client. The agent drafts; a person clicks send. Fourth, a log — a plain record of what each agent did, so when something goes sideways you can answer "what happened" in minutes, not days.
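For teams that script their own agents, the four dials can be sketched in a few dozen lines. This is a minimal illustration, not any provider's API: the `Guard` class, the tool names, the dollar figures, and the approver name are all hypothetical, and the in-code cap is a second line of defense alongside the ceiling set in the billing console, not a replacement for it.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


class Blocked(Exception):
    """Raised when one of the controls refuses an action."""


@dataclass
class Guard:
    """Hypothetical wrapper applying the four dials to one agent."""
    name: str
    daily_cap_usd: float          # dial 1: spending cap
    allowed_tools: frozenset      # dial 2: least access
    irreversible: frozenset       # dial 3: tools that need a human
    spent_usd: float = 0.0
    log: list = field(default_factory=list)  # dial 4: plain record

    def _record(self, tool, outcome):
        stamp = datetime.now(timezone.utc).isoformat()
        self.log.append((stamp, self.name, tool, outcome))

    def run(self, tool, cost_usd, action, approved_by=None):
        """Run action() only if every dial allows it."""
        if tool not in self.allowed_tools:
            self._record(tool, "denied: out of scope")
            raise Blocked(f"{self.name} may not use {tool}")
        if tool in self.irreversible and approved_by is None:
            self._record(tool, "held: needs approval")
            raise Blocked(f"{tool} needs a person to approve it")
        if self.spent_usd + cost_usd > self.daily_cap_usd:
            self._record(tool, "denied: cap reached")
            raise Blocked(f"{self.name} hit its daily cap")
        # Count the cost before acting, so a failing action still uses budget.
        self.spent_usd += cost_usd
        result = action()
        note = "ok" if approved_by is None else f"ok, approved by {approved_by}"
        self._record(tool, note)
        return result


bot = Guard(
    name="invoice-bot",
    daily_cap_usd=1.00,
    allowed_tools=frozenset({"read_invoices", "send_payment"}),
    irreversible=frozenset({"send_payment"}),
)
bot.run("read_invoices", 0.25, lambda: "3 invoices read")
try:
    bot.run("send_payment", 0.10, lambda: "paid")  # no approver, so it is held
except Blocked as held:
    print(held)
bot.run("send_payment", 0.10, lambda: "paid", approved_by="dana")
```

Refusals are written to the log as well as successes, so the record answers both "what did the agent do" and "what did it try to do."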

None of this requires a compliance team. It requires an afternoon and the willingness to treat an agent like a new hire you wouldn't hand the company credit card on day one. The teams that win the next year won't be the ones running the most agents. They'll be the ones who can sleep at night while the agents run.

Why it matters

Running agents is now cheap and easy; the costs and risks land on whoever forgot to set limits. A small team that puts four simple controls in place — spending caps, least access, human gates, and a log — gets the productivity without the 2am surprise invoice or the leaked customer list.

Network impact

Latency: Adding a human gate on irreversible actions adds seconds to a workflow, not minutes, which is a fair trade. Agent-to-agent chains, by contrast, can quietly multiply API round-trips and slow everything down; logging surfaces where that's happening.

Security: This is the core of the story. An agent inherits the access you give it, and a prompt injection or a bad loop turns that access into your attack surface. Least-access scoping and a human gate on sending/deleting are the two highest-leverage controls a small team can apply.

Scalability: Governance is what lets you add a fifth and sixth agent without the risk compounding. Caps and logs scale linearly; running blind does not, because the failure cost grows faster than the agent count.

What to do

Open your AI provider's billing console (Anthropic, OpenAI, etc.) and set a hard monthly spend cap on every API key today — an automatic shutoff, not just an email alert.
List every agent or automation you run and write one line each: what it touches (files, email, payments, customer data). If you can't list them, that's your first problem to fix.
Cut each agent's access to the minimum it needs. The invoice bot doesn't get the customer list; the support bot doesn't get the bank API.
Put a human in the loop on anything irreversible — sending money, deleting records, emailing clients. Let agents draft; require a person to approve the send.
Turn on logging so you have a plain record of what each agent did and when. Most platforms have it; switch it on before you need it.
Revisit these four controls monthly as you add agents — treat each new one like a hire you wouldn't hand the company card on day one.
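The inventory and least-access steps above can live in a spreadsheet, but a short script makes the monthly review repeatable. The sketch below is one way to do it; the agent names and access labels are made-up examples, and the "needs" column is something you fill in by hand.

```python
# Hypothetical inventory: what each agent needs versus what it was granted.
inventory = {
    "support-bot": {
        "needs": {"customer_data", "email"},
        "granted": {"customer_data", "email", "payments"},
    },
    "invoice-bot": {
        "needs": {"files", "payments"},
        "granted": {"files", "payments", "customer_data"},
    },
    "marketing-bot": {
        "needs": {"files"},
        "granted": {"files"},
    },
}


def excess_access(agents):
    """Return, per agent, anything granted beyond what it needs."""
    report = {}
    for name, entry in agents.items():
        extra = entry["granted"] - entry["needs"]
        if extra:
            report[name] = sorted(extra)
    return report


for agent, extras in excess_access(inventory).items():
    print(f"{agent}: revoke {', '.join(extras)}")
```

An empty report means every agent is already at its minimum. Anything printed is a permission to revoke before the next agent is added.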

