Tags: compliance, governance, human-in-the-loop, ai-agents, financial-services

Human-in-the-Loop AI Compliance: Who Approves the Edits

I let AI auto-apply 32 files of compliant copy edits, then reverted all of it. Here's the human-in-the-loop AI compliance workflow that survives an examiner.

By Mike Hodgen


The Mistake: I Shipped 32 Files of AI Edits in One Pass

I run a financial advisory firm's content through a pretty simple test: would a licensed registered principal stand behind every word if an examiner came calling. At one client, that principal is personally, legally accountable for every public-facing communication the firm publishes. Not the marketing team. Not me. Him.

So when the firm needed compliant rewrites across 32 files of public copy, I did what felt efficient. I had an AI read every page, draft minimal compliant rewrites, and I shipped all 32 at once.

Here is the uncomfortable part. The edits were good. The model caught language that needed softening, removed implied guarantees, tightened claims that could have drawn a flag. Technically, the output was clean.

The process was indefensible.

I had let an AI auto-apply changes to regulated content that a licensed human was supposed to review and approve, one by one, before any of it went live. The words were correct. The workflow was wrong. And in a regulated firm, the workflow is the thing that gets you sanctioned.

This is the core of human-in-the-loop AI compliance, and it is what this article is about. Not whether AI can write compliant copy. It can. The question is who approves the edits, and whether you can prove it.

That mistake was mine. Not the model's. The AI did exactly what I asked. I asked the wrong thing.

What follows is how I caught it, why I reverted all 32 files, and how I rebuilt the whole thing into something that survives an audit instead of triggering one.

Why I Reverted Every Single Change

The right call was to back all of it out. Every file. Restore the originals and start over.

That feels wasteful when the edits were good. It is not. In a regulated firm, good copy applied through a bad process is a liability, not an asset.

The model didn't make the edits accountable

Here is the accountability chain that matters. When an examiner reviews public communications, they do not ask "did the AI get the language right." They ask "who reviewed and approved this communication, and when."

If the honest answer is "a consultant let an AI auto-apply edits to 32 files in one pass," you have already failed. Nobody even needs to read the copy. The approval step is missing, and the approval step is the entire point of having a licensed principal.

The model is not a licensed person. It cannot sign off on a public communication. It has no standing in front of a regulator. So a workflow where the model's output goes live without a named human approval is a workflow with a hole in exactly the spot examiners look first.

The consultant doesn't get final say either

I want to be blunt about my own role, because this is the part buyers get wrong. I am the AI consultant. I built the system. I do not get final say on regulated copy either.

Figure: AI Proposes, Human Disposes Accountability Chain. The AI drafts, the consultant builds, and only the licensed principal approves regulated content.

Only the licensed registered principal does. Not me, not the model, not the marketing director. The accountable party is the human whose license is on the line.

This is the principle behind a rule I follow: every AI system I ship stops for a human. AI changes nothing about who is liable. It only changes how fast the work gets prepared for the person who is.

That is the reframe. AI is not the approver. It is the thing that makes the approver's job faster. Once I got that straight, the rebuild was obvious.

Rebuilding It as a Review Queue, Not an Auto-Apply Script

The original version was a script that read a file, generated a rewrite, and wrote it back. Fast, clean, completely wrong for this context.

The rebuild turned every proposed edit into a row in a review queue. Nothing got applied. Everything got proposed and parked, waiting for one human to act on it.

Each proposed edit becomes one reviewable row

Each row holds four things, conceptually: the original text, the proposed replacement, the exact location on the live page, and a status. That status starts as "pending" and stays there until the principal does something with it.

No file path or repo name ever shows up in front of the principal. He should not have to care where the content lives in code. He should only see the words, the proposed change, and where it appears on the public site.

The structure matters because it breaks the all-or-nothing problem. With 32 files auto-applied, there is one decision: ship or don't. With a review queue, there might be 80 individual edits, and each one is its own decision with its own audit record.
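A queue row like the one described above could be modeled roughly as follows. This is a minimal sketch, not the system's actual schema: the names `ProposedEdit`, `Status`, `reviewer_view`, and `source_file` are hypothetical, and the status values other than "pending" are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # Only "pending" is named in the article; the other values are assumed.
    PENDING = "pending"
    CONFIRMED = "confirmed"
    MODIFIED = "modified"
    SKIPPED = "skipped"


@dataclass
class ProposedEdit:
    original_text: str                # the copy as it reads today
    proposed_text: str                # the AI's drafted replacement
    page_url: str                     # where the text appears on the public site
    status: Status = Status.PENDING   # stays pending until the principal acts
    source_file: str = ""             # internal bookkeeping (assumed), never shown


def reviewer_view(edit: ProposedEdit) -> dict:
    """Return only what the principal should see: words, change, location, status."""
    return {
        "original": edit.original_text,
        "proposed": edit.proposed_text,
        "location": edit.page_url,
        "status": edit.status.value,
    }
```

Keeping the file path on the row but out of `reviewer_view` is one way to let a later apply step group edits by file without ever putting repo details in front of the reviewer.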

Deep-linked to the exact spot on the live page

This is the part I am proudest of. Each row deep-links to the precise spot on the live site using a browser text-fragment highlight. The principal clicks, the live page opens, scrolls to the exact sentence, and highlights it.

He does not read a diff in a developer tool he has never used. He sees the copy the way the public sees it, in context, on the real page.

That context is not a nicety. A registered principal reviewing public communications needs to evaluate how language reads to a customer, not how it looks as a code change. Context is what makes his review a real review instead of rubber-stamping a list.
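The deep link described above can be sketched with the URL text-fragment syntax (`#:~:text=`). This is an illustration under assumptions: the function name `text_fragment_link` and the example URL are hypothetical, the page URL is assumed to carry no existing fragment, and browser support for text fragments should be checked for the reviewer's browser.

```python
from urllib.parse import quote


def text_fragment_link(page_url: str, sentence: str) -> str:
    """Build a link that asks the browser to scroll to and highlight `sentence`."""
    # Percent-encode everything, including ',' and '&', which are
    # separators in the text-fragment syntax.
    encoded = quote(sentence, safe="")
    # quote() never encodes '-', but '-' marks prefix/suffix context in a
    # text directive, so it is encoded by hand.
    encoded = encoded.replace("-", "%2D")
    return f"{page_url}#:~:text={encoded}"


link = text_fragment_link(
    "https://example.com/services",
    "long-term growth, not guarantees",
)
# link == "https://example.com/services#:~:text=long%2Dterm%20growth%2C%20not%20guarantees"
```

If the quoted sentence appears more than once on a page, the browser highlights the first match, so a real build would likely need a longer or more specific snippet per row.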

This is the pattern where the AI wrote the fixes and the human approved every one. AI proposes. The human disposes. The queue is just the machine that enforces that order.

Confirm, Modify, or Skip: The Only Three Controls That Matter

Every row in the queue has exactly three controls. I resisted the urge to add more, because more controls mean more confusion and a slower review.

Figure: Review Queue Row Anatomy with Three Controls. A queue row shows the original text, the proposed edit, the location, the status, and the three controls: confirm, modify, skip.

Confirm. The principal approves the proposed edit as written. The AI's wording stands, but now it stands because a licensed human chose it.

Modify. The principal rewrites the wording himself. The AI's draft is just a starting point. He changes a word, a phrase, a whole sentence, and his version replaces the proposal.

Skip. The principal rejects the edit outright or defers it. The original text stays, and the row is marked so it never sneaks into the final apply.

Every action auto-saves the instant he takes it. No save button, no lost work if his session drops mid-review. He can close the tab, come back tomorrow, and pick up exactly where he left off.
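The three controls and the save-on-every-action behavior might look something like this in code. It is a sketch under assumptions: `Row`, `act`, and the `save` callback are hypothetical names, the "modified" status is assumed, and real persistence (a database write, for example) is left to whatever is passed in as `save`.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Row:
    original_text: str
    proposed_text: str
    status: str = "pending"
    final_text: str = ""                      # the wording that would ship
    history: list = field(default_factory=list)


def act(row, reviewer, action, new_text=None, save=lambda row: None):
    """Apply one reviewer action, record who/what/when, and save at once."""
    if action == "confirm":
        row.status, row.final_text = "confirmed", row.proposed_text
    elif action == "modify":
        if not new_text:
            raise ValueError("modify needs the reviewer's own wording")
        # The reviewer's wording replaces the AI's proposal.
        row.status, row.final_text = "modified", new_text
    elif action == "skip":
        # The original text stays in place.
        row.status, row.final_text = "skipped", row.original_text
    else:
        raise ValueError(f"unknown action: {action}")
    row.history.append({
        "who": reviewer,
        "action": action,
        "when": datetime.now(timezone.utc).isoformat(),
    })
    save(row)  # persist immediately; there is no separate save step
    return row
```

Calling `save` inside `act` rather than on a button press is what makes a dropped session harmless: every completed action is already stored.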

Modify is the most important control, and most AI tools do not have it.

Think about what Confirm and Skip alone give you: approve or veto. The AI still wrote every word that ships. The human only gets to say yes or no to the machine's language.

In a regulated context that is not enough. The accountable human needs to be able to overwrite the AI, not just approve or veto it. When he uses Modify, the final words are genuinely his. They came out of his judgment, his license, his accountability.

That is the line between a tool that drafts and a tool that decides. A drafting tool hands a licensed human a starting point he can change. A deciding tool hands him a finished decision and asks him to bless it. Only one of those survives a regulator asking "whose words are these."

In regulated content, the human must be able to overwrite the AI, not just approve or veto it. Build that, or you have built a rubber stamp with extra steps.

Nothing Touches the Live Site Until Everything Is Approved

The live site is the protected resource. The only path to it runs through full human approval. That is the design intent behind the final gate.

The apply script only restores fully-confirmed files

There is a separate apply script, and it has one hard rule. It only touches a file when every single edit in that file is confirmed. One skipped edit, one pending row, one rejected proposal in a file, and that whole file stays untouched.

Figure: The Apply Gate: Fully-Confirmed Files Only. The apply script updates a file only when every edit in it is confirmed; otherwise the file stays untouched.

This kills the half-applied state, which is the dangerous one. Imagine a page where three edits go live but two were still under review. Now the page is a mix of approved and unapproved language, and nobody can cleanly say what state it is in. That is exactly the ambiguity an examiner pulls on.

By gating on fully-confirmed files only, the live site is always in a known state. Either a page reflects fully approved copy, or it reflects the original. Never a mongrel of both.
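The gate described above reduces to an all-or-nothing check per file. A minimal sketch, assuming each row is a dict with hypothetical `file` and `status` keys, and assuming that an edit the principal rewrote himself ("modified") counts as approved alongside "confirmed":

```python
from collections import defaultdict

# Assumption: a principal-modified edit counts as approved.
APPROVED = {"confirmed", "modified"}


def files_ready_to_apply(rows):
    """Return the files in which every single edit has been approved."""
    statuses_by_file = defaultdict(list)
    for row in rows:
        statuses_by_file[row["file"]].append(row["status"])
    # One pending or skipped row is enough to hold back the whole file.
    return sorted(
        name
        for name, statuses in statuses_by_file.items()
        if all(status in APPROVED for status in statuses)
    )


rows = [
    {"file": "about.html", "status": "confirmed"},
    {"file": "about.html", "status": "modified"},
    {"file": "fees.html", "status": "confirmed"},
    {"file": "fees.html", "status": "pending"},
    {"file": "team.html", "status": "skipped"},
]
ready = files_ready_to_apply(rows)  # ["about.html"]
```

Because the check runs per file rather than per edit, a page can never end up with some approved sentences live and others still under review.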

This is one of the kill-switches I build into every system. The default is that nothing ships. Changes move toward the live site only through a human checkpoint, never around one.

A completion-notify cron tells me when the queue is clear

I do not want to ping the principal asking if he is done. I do not want to guess. So a completion-notify cron checks the queue and emails me the moment he has cleared every row.

When that email lands, I know the review is genuinely finished and the apply script can run against fully-confirmed files. No nagging, no manual checking, no assumption.
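The check such a scheduled job runs could be as small as the sketch below. The function name `completion_message` and the row shape are hypothetical, the sketch assumes a skipped row counts as cleared because the principal acted on it, and actually sending the email is left to the caller.

```python
from collections import Counter


def completion_message(rows):
    """Return a summary once no row is pending, otherwise None."""
    # An empty queue is not treated as "finished"; there is nothing to report.
    if not rows or any(row["status"] == "pending" for row in rows):
        return None
    counts = Counter(row["status"] for row in rows)
    summary = ", ".join(f"{n} {status}" for status, n in sorted(counts.items()))
    return f"Review complete: {len(rows)} rows ({summary})"


message = completion_message([
    {"status": "confirmed"},
    {"status": "skipped"},
    {"status": "confirmed"},
])
# message == "Review complete: 3 rows (2 confirmed, 1 skipped)"
```

Returning `None` until the queue is clear means the job can run on any schedule and stay silent until there is something true to say.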

This is the part that survives an examiner. Every row carries who acted on it, what they chose, and when. Confirm, modify, or skip, all timestamped, all attributable to the licensed human.

A clean audit trail of who approved what, when, is not a feature you bolt on after a regulator asks. It is the thing the whole system exists to produce.

What This Pattern Costs vs. What It Protects

I will be honest about the tradeoff. A review queue is slower than auto-apply. The principal has to actually sit down and review every row, and with 80-plus edits that is real time out of his week.

Figure: Three Workflow Approaches Compared: Cost vs Defensibility. A comparison of no AI, auto-apply, and review queue across speed, human effort, audit trail, and examiner survivability.

There is no way around that, and I would distrust anyone who told you otherwise.

But look at the math. The AI did the slow, tedious part: reading every page, drafting minimal compliant rewrites, and locating the exact spot for each one. That is hours of work it compressed into minutes. The human does only the part that legally must be human, which is the approval.

Run the alternatives. Without AI, the principal finds and rewrites everything himself, which takes hours he does not have, on work he probably will not do well. With auto-apply, you save his time but lose every shred of defensibility, which is what I learned the hard way.

The queue is the only version that is both fast and survivable. The AI absorbs the volume. The human keeps the decision. Neither side does the other's job.

This is the same logic I use for shipping AI content in a regulated industry. Speed and compliance are not opposites once you put the human at the right point in the pipeline instead of in front of every keystroke.

Here is what still does not work, and I want to say it plainly. This depends entirely on the principal actually doing the reviews. No tool removes that. If he rubber-stamps 80 rows in four minutes without reading them, the audit trail is clean but the review was theater.

Any vendor claiming their AI removes the human approval requirement in a regulated industry is not selling you efficiency. They are selling you liability with a nice interface.

The Question Every Regulated CEO Should Ask a Vendor

Let me resolve the doubt I opened with. If AI edits your regulated content, the accountable party is your licensed human, every time. That does not change. What a serious AI build changes is how easy it is to prove that accountability, not whether you have it.

Figure: The Vendor Test Question. Good-build versus bad-build responses to the question of where regulated AI content gets approved.

A bad build buries the approval step or skips it. A good build makes the approval the centerpiece and produces the audit trail as a byproduct.

So here is a test you can run on any AI vendor, today, before you sign anything. Ask them: "Where does my licensed principal approve each change, and can you show me the audit trail."

Watch what happens. If they have a real answer, a place where the accountable human confirms or modifies each edit, and a record of who did what when, they understand your business. If they wave it off, talk about how the AI is so accurate you won't need review, or describe a batch process that applies everything at once, they are shipping you risk and calling it speed.

This is how I build for regulated firms. The AI does the volume. The human owns the decision. The system exists to make that decision fast to prepare and easy to prove.

If you are in a regulated industry and your board is asking about AI, this is the kind of workflow that keeps you out of trouble while still getting the speed. The two are not in conflict when the human sits at the right gate.
