
Agent authority levels: a CFO/COO ladder for autonomous work (with caps + guardrails)

A practical 4-level framework for approving AI agents safely: read-only, draft, execute-with-approval, and execute-with-caps—plus the guardrails CFOs and COOs need to trust automation.

February 4, 2026 · Justin Musterman, Technology and Marketing Executive

Executive summary

Most AI programs stall at the same moment:

The moment the agent can do things, not just suggest things.

Executives don’t block automation because they dislike AI. They block it because the organization can’t answer basic control questions:

  • What exactly is the agent allowed to do?
  • Where are the approval gates?
  • What are the caps (cost, volume, spend, blast radius)?
  • If something goes wrong, can we audit and roll back?

A useful mental model for CFOs and COOs is to treat agent permissions like you treat financial controls: authority levels.

This post introduces a simple ladder you can use to approve agent workflows quickly while keeping risk bounded:

  1. Read-only (observe)
  2. Draft (propose)
  3. Execute-with-approval (human gate)
  4. Execute-with-caps (autonomous within strict limits)

Then we’ll cover the minimum guardrails that make each level safe:

  • run receipts (auditability)
  • verification (pre-flight checks)
  • caps (spend/volume/time)
  • reconciliation (post-flight checks)
  • kill switches + rollback

If you install this ladder, you can expand automation without creating an ungoverned “shadow operations” layer.

Why “authority” is the real problem (not model quality)

Most teams talk about agent risk like it’s a model problem:

  • hallucinations
  • safety filters
  • prompt injection

Those are real issues. But they’re not the main reason AI agents fail inside businesses.

The main failure is unclear authority.

If your policy is “the agent can do stuff,” you’ve created a new worker with:

  • unclear job description
  • unclear limits
  • unclear accountability

That’s not an AI problem. It’s an operating system problem.

The Agent Authority Ladder (4 levels)

Level 1 — Read-only (Observe)

What it can do

  • read data from systems of record (CRM, ERP, ticketing, data warehouse)
  • retrieve and summarize documents (SOPs, contracts, policies)
  • generate dashboards, narratives, variance explanations

What it cannot do

  • change records
  • send messages externally
  • trigger workflows

Why this level is valuable

Read-only agents produce leverage immediately:

  • faster reporting
  • quicker answers to operational questions
  • fewer ad-hoc data pulls

And they’re easy to approve because they don’t change reality.

Minimum guardrails

  • least-privilege access (service accounts, scoped APIs)
  • logging of what was accessed (records, tables, docs)
  • redaction rules for sensitive fields
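The three guardrails above can be sketched as a thin read-only access layer. This is a minimal sketch, not a prescribed implementation; the sensitive field names (`ssn`, `bank_account`, `salary`) and the `read_record` signature are illustrative assumptions.

```python
import logging
from typing import Any

# Illustrative sensitive fields; replace with your own schema's list.
SENSITIVE_FIELDS = {"ssn", "bank_account", "salary"}

access_log = logging.getLogger("agent.read_access")

def redact(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the record with sensitive fields masked."""
    return {k: ("[REDACTED]" if k in SENSITIVE_FIELDS else v)
            for k, v in record.items()}

def read_record(source: str, record_id: str,
                record: dict[str, Any]) -> dict[str, Any]:
    """Read-only access path: log which fields were touched,
    then hand back only a redacted view of the record."""
    access_log.info("read %s/%s fields=%s", source, record_id, sorted(record))
    return redact(record)
```

The point of the design is that the agent never sees the raw record at all; redaction happens inside the access layer, so prompt-level mistakes cannot leak what was never retrieved.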

Good first workflows

  • “Weekly pipeline + forecast narrative” (reads CRM + finance assumptions)
  • “Invoice exception triage summary” (reads AR aging + notes)
  • “Customer support root-cause summary” (reads tickets + tags)

Level 2 — Draft (Propose)

What it can do

  • generate drafts: emails, tickets, invoices, follow-ups, SOP updates
  • generate structured plans: “here are the steps I would take”
  • generate suggested record updates (but not apply them)

What it cannot do

  • send or execute automatically

Why this level is valuable

Drafting compresses cycle time and reduces repetitive work, while keeping a human in the loop.

This is often the fastest path to measurable productivity.

Minimum guardrails

  • drafts are clearly labeled as drafts
  • citations/links for any factual claims pulled from internal sources
  • a “review checklist” next to the draft (what the human must verify)
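One way to make those three guardrails structural rather than procedural is to wrap every Level 2 output in a draft type that cannot be rendered without its label, citations, and checklist. The field names here are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass

@dataclass
class Draft:
    """A Level 2 artifact: never sent automatically, always labeled."""
    body: str
    citations: list[str]         # links to the internal sources used
    review_checklist: list[str]  # what the human must verify before sending
    label: str = "DRAFT - not sent"

    def render(self) -> str:
        """Render the draft with its label, sources, and checklist attached."""
        checks = "\n".join(f"[ ] {item}" for item in self.review_checklist)
        return (f"{self.label}\n\n{self.body}\n\n"
                f"Sources: {', '.join(self.citations)}\n\n"
                f"Review before sending:\n{checks}")
```

Because the label and checklist travel with the body, there is no separate step a busy reviewer can skip.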

Good workflows

  • “Draft customer renewal email using account history”
  • “Draft 3 collections follow-ups with escalating tone”
  • “Draft the QBR deck outline from last month’s metrics”

Level 3 — Execute-with-approval (Human gate)

What it can do

  • propose an action plan
  • pass verification checks
  • execute only after a human approves

Execution examples:

  • update CRM fields
  • open/close tickets
  • schedule meetings
  • send customer emails
  • submit vendor forms

Why this level is valuable

Level 3 is where AI starts to create the “headcount efficiency” CFOs actually care about, because work is not just drafted — it gets completed.

It also creates the fastest path to safe learning: you see real outcomes while still controlling risk.

Minimum guardrails

  • pre-flight verification (schema checks, numeric checks, allowlists)
  • run receipts (what it did, with what inputs, and why)
  • approval gate tied to a real identity (who approved, when)
  • limited batch size (e.g., max 10 actions per run)

Good workflows

  • “Create Jira tickets from a support incident summary (Ops approval)”
  • “Update pipeline stage + next steps (Sales Ops approval)”
  • “Send invoice reminders (AR approval)”

Level 4 — Execute-with-caps (Autonomous within strict limits)

What it can do

  • execute without human approval within caps

This is the equivalent of delegating to a trusted operator with explicit limits.

Why this level is valuable

It’s the level where automation becomes a true operating advantage — but only if risk is bounded.

The key is that Level 4 is not “unlimited autonomy.”

Level 4 is autonomy with caps.

The caps that make Level 4 safe (a CFO/COO checklist)

Think of caps as “blast radius limits.” You can mix and match.

1) Spend / money caps

  • max $X per day per vendor
  • max $Y per invoice credit/adjustment
  • refunds allowed only under $Z and only for predefined reasons

2) Volume caps

  • max N emails per hour
  • max N CRM updates per run
  • max N tickets created per day

3) Scope caps (where it can act)

  • only these customer segments
  • only these regions
  • only these products
  • only these ticket categories
  • only these CRM pipelines

4) Time caps

  • runs only during business hours
  • must stop if execution exceeds T minutes

5) Confidence / certainty caps

  • only execute when data quality checks pass
  • only execute when required fields are present
  • only execute when a second “critic” pass flags no policy issues

6) Change caps (reversibility)

  • only reversible actions
  • only idempotent API calls
  • write changes as a “pending state” first, then finalize after reconciliation

7) Cost caps (AI usage)

  • token budget per run
  • max retries
  • circuit breaker if cost/run spikes

If you can’t define caps, you’re not ready for Level 4.
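Defining caps concretely can be as simple as a frozen config object plus one predicate that every autonomous run must pass. The cap values and categories below are examples drawn from the checklist above, not recommended defaults.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Caps:
    """Blast-radius limits for one Level 4 workflow. Example values only."""
    max_spend_per_day_usd: float = 500.0
    max_actions_per_run: int = 25
    allowed_segments: frozenset = frozenset({"smb"})
    business_hours_only: bool = True        # 9:00-17:00 in this sketch
    token_budget_per_run: int = 50_000

def within_caps(caps: Caps, *, spend_today: float, action_count: int,
                segment: str, hour: int, tokens_used: int) -> bool:
    """True only if every cap holds; any single breach halts execution."""
    return (spend_today <= caps.max_spend_per_day_usd
            and action_count <= caps.max_actions_per_run
            and segment in caps.allowed_segments
            and (not caps.business_hours_only or 9 <= hour < 17)
            and tokens_used <= caps.token_budget_per_run)
```

The useful property is that the caps live in one reviewable object: a CFO can read and sign off on the `Caps` instance without reading any agent code.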

The guardrail bundle: propose → verify → execute → reconcile

A reliable agent workflow looks like payments, not chat.

Step 1: Propose

The agent should explicitly list:

  • the actions it intends to take
  • the records it will touch
  • any assumptions

Step 2: Verify (pre-flight)

Verification can be rules-based, model-based, or both:

  • schema validation (“all required fields present”)
  • numeric checks (“amounts sum correctly; no negative totals”)
  • allowlists (“only these domains can receive emails”)
  • policy checks (“no PII in outbound message”)
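The rules-based checks above can be sketched as a verifier that returns a list of failures; an empty list means the proposal may proceed. The proposal fields, required-field set, and allowlisted domain are illustrative assumptions.

```python
# Illustrative policy; substitute your own fields and domains.
ALLOWED_DOMAINS = {"example.com"}
REQUIRED_FIELDS = {"to", "subject", "body", "amount_due"}

def verify_proposal(proposal: dict) -> list[str]:
    """Pre-flight checks: missing fields, numeric sanity, allowlist."""
    failures = []
    missing = REQUIRED_FIELDS - proposal.keys()
    if missing:
        failures.append(f"missing fields: {sorted(missing)}")
    else:
        if proposal["amount_due"] < 0:
            failures.append("negative amount")
        domain = proposal["to"].rsplit("@", 1)[-1]
        if domain not in ALLOWED_DOMAINS:
            failures.append(f"recipient domain {domain!r} not on allowlist")
    return failures
```

Returning all failures at once, rather than raising on the first, gives the approver (or the exception queue) the full picture in one pass.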

Step 3: Execute

Execution should be:

  • small batches
  • safe defaults
  • reversible when possible

Step 4: Reconcile (post-flight)

Reconciliation is what makes automation finance-grade.

Examples:

  • after updating the CRM, re-read the record and confirm the fields match intent
  • after sending emails, check delivery/bounce and create exceptions for failures
  • after creating tickets, confirm they landed in the right project/queue
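The first example above, re-reading a record and confirming it matches intent, reduces to a small diff between what the agent meant to write and what the system now returns. A minimal sketch, with hypothetical field names:

```python
def reconcile(intended: dict, observed: dict) -> list[str]:
    """Compare intended writes against the re-read record.
    A non-empty result should open an exception, never be ignored."""
    mismatches = []
    for field_name, want in intended.items():
        got = observed.get(field_name)
        if got != want:
            mismatches.append(f"{field_name}: wanted {want!r}, found {got!r}")
    return mismatches
```

Note that reconciliation only inspects the fields the agent intended to change; other fields may legitimately differ because humans and other systems write to the same records.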

Run receipts: the audit log you’ll wish you had later

Every Level 3–4 workflow should emit a run receipt:

  • who/what initiated the run
  • workflow version (prompt/tool config)
  • inputs (or pointers/hashes)
  • records accessed
  • actions executed (with IDs)
  • approvals (if any)
  • exceptions + escalations
  • reconciliation results

This is the difference between “we tried agents” and “we operate agents.”

A pragmatic rollout plan (30 days)

You can implement the ladder quickly.

Week 1: Pick one workflow + set Level 1 access

  • choose something high-volume and measurable
  • implement read-only data access + logging

Week 2: Add Level 2 drafts

  • drafts + review checklist
  • start capturing run receipts (even if minimal)

Week 3: Add Level 3 execution with approval

  • pre-flight verification
  • explicit approval gate
  • small batch execution

Week 4: Promote one slice to Level 4 with caps

  • define caps
  • add circuit breakers
  • add reconciliation + exception queue

At the end of 30 days, you should have at least one workflow that is measurably faster and provably controlled.

Closing thought

If you want AI agents to create real operating leverage, don’t argue about “autonomy” in the abstract.

Define authority like you define financial controls:

  • levels
  • caps
  • audit logs
  • reconciliation

That’s how you scale automation without scaling risk.
