Adversarial AI Review: 11 Agents Attacked My Spec (Simply Explained)

A few months ago I was building an AI system for a law firm. It would read incoming leads, figure out which cases were strong, and hand the firm a clean ranked list of who to call first.

The first version looked great. Smart, even.

That is exactly what worried me.

Why a Good-Looking Plan Scares Me

When you work in a regulated business like law, a clever design hides its mistakes until a real person gets hurt. And by then it is too late.

Picture this. A client with a strong case gets a low score and never gets a callback. The clock runs out on their claim. The firm faces an ethics complaint. You cannot undo that by fixing the software later. There is no do-over with a real client and a real deadline.

So before I wrote a single line of code, I put the design itself on trial.

I ran what I call an adversarial AI review. Think of it like a courtroom. Instead of one assistant telling me "looks good," I assembled a team of eleven AI specialists, each with one job: break the plan from a different angle. Find the worst thing it could do.

They came back with 33 problems.

Some were noise. But a few would have quietly destroyed real people's cases, and they never would have shown up in normal testing. They were not bugs in the code. They were bad decisions baked into the plan that looked perfectly reasonable on paper.

The best part: I caught them while the plan was still just a document. No code existed yet. So every fix was free.

Four Problems That Looked Fine and Were Anything But

Most people think getting AI feedback means asking one program "is this good?" That gets you a thumbs up and some vague suggestions. Useless.

A real review is a panel. Each AI specialist gets a specific lens. One looks at fairness. One looks at legal ethics. One checks whether the system is pretending to be precise when it is really just guessing. Their job is to attack, not to bless.

Here are four problems they caught.

The single score that quietly buried clients. The plan boiled every lead down to one number. Clean and simple. The trouble is that number would have buried entire groups of people on weak signals. A bad phone connection. A client who is nervous and not articulate. All of it crushed into a low score, and a low score meant nobody called back. The firm would never even see the clients it dropped.

The forced-citation trap. The plan required the AI to cite a legal source on every answer. Sounds rigorous. It is the opposite. When you force an AI to produce a citation it does not actually have, it makes one up. You end up with confident, fake case names and statute numbers. The rule meant to add accuracy was a lie factory.

The word that created a legal duty. The plan promised prospects a "confidential review." In a legal setting, the word "confidential" is not marketing. It can create a duty that the firm cannot actually honor for everyone who fills out a form. Good-looking sales copy, real liability.

Guesses dressed up as facts. The AI was handing the system legal deadline values as if they were solid facts pulled from a database. They were guesses. The system would then make real decisions on those invented numbers, with false confidence.

None of these show up in normal testing. They are not glitches. They are sentences and decisions that look completely fine until a real client is on the other side.

Turning Problems Into Hard Rules

A problem you spot is worthless if it stays a suggestion. Every real one became a hard rule built into the plan.

We stopped pretending the AI's scores were exact. The AI judges whether a case has merit. Any actual math runs in plain code. The AI is a judgment engine, not a calculator, so I stopped asking it to do arithmetic.
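A minimal sketch of that split, as an illustration rather than the production system. The names `Merit`, `parse_merit`, and `days_remaining` are hypothetical, and the limitation period is assumed to come from a verified source, never from the model.

```python
from datetime import date, timedelta
from enum import Enum


class Merit(Enum):
    """The only thing the model is allowed to return: a judgment label."""
    STRONG = "strong"
    UNCLEAR = "unclear"
    WEAK = "weak"


def parse_merit(model_output: str) -> Merit:
    """Accept a known label; anything else (including a number) becomes UNCLEAR."""
    try:
        return Merit(model_output.strip().lower())
    except ValueError:
        return Merit.UNCLEAR


def days_remaining(incident: date, limitation_days: int, today: date) -> int:
    """Plain date arithmetic in code. limitation_days is an assumed, verified input."""
    deadline = incident + timedelta(days=limitation_days)
    return (deadline - today).days
```

If the model answers with something like "87/100", `parse_merit` refuses to treat it as a score and falls back to `Merit.UNCLEAR`, which a human can then look at.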

We made citations optional, and any source the AI does offer gets checked against a real database before anyone sees it. If it does not check out, it gets dropped. The AI can suggest. It cannot invent a law that does not exist.
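A rough sketch of that check, under loud assumptions: `KNOWN_CITATIONS` is a made-up in-memory set standing in for a real citation database, and `Answer` and `verify_citations` are hypothetical names, not part of any library.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for a lookup against a real citation database.
KNOWN_CITATIONS = {"Placeholder Citation A", "Placeholder Citation B"}


@dataclass
class Answer:
    text: str
    # Citations are optional: an empty list is a valid answer.
    citations: list = field(default_factory=list)


def verify_citations(answer: Answer, known=KNOWN_CITATIONS) -> Answer:
    """Keep only citations found in the trusted set; silently drop the rest."""
    kept = [c for c in answer.citations if c in known]
    return Answer(text=answer.text, citations=kept)
```

The design choice is that an unverifiable source is removed rather than flagged, so a reader never sees a citation the database could not confirm.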

We split "is this case strong" from "what is it worth." Two separate questions. A strong case with a small payout should not get buried. This forced the firm to make that tradeoff on purpose instead of letting the software make it invisibly.
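One way to picture the split, as a sketch only. `Assessment`, `call_order`, and `EXAMPLE_POLICY` are hypothetical, and the priority numbers are invented for illustration. The point is that the tradeoff lives in a table the firm writes down, and a combination nobody decided on raises an error instead of getting a silent default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    lead_id: str
    strength: str    # "strong", "unclear", or "weak"
    value_band: str  # "small" or "large"


# Hypothetical policy chosen by the firm; lower number means call sooner.
EXAMPLE_POLICY = {
    ("strong", "large"): 0,
    ("strong", "small"): 1,
    ("unclear", "large"): 2,
    ("unclear", "small"): 3,
    ("weak", "large"): 4,
    ("weak", "small"): 5,
}


def call_order(assessments, policy):
    """Sort leads by the firm's explicit policy.

    A (strength, value_band) pair missing from the policy raises KeyError,
    so the software cannot make the tradeoff invisibly.
    """
    return sorted(assessments, key=lambda a: policy[(a.strength, a.value_band)])
```

With this example policy, a strong case with a small payout is called before a weak case with a large one.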

And when the facts are unconfirmed, which is most of the time at intake, the system gives an honest wide range instead of a fake precise number. Not "$48,000." A range wide enough to admit how little it actually knows yet.
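A small sketch of the idea, with assumptions stated up front: `honest_range` is a hypothetical helper, the 10,000 rounding step is arbitrary, and the dollar figures in the usage note are invented. When facts are unconfirmed, the bounds are pushed outward to a coarse step so the output cannot look more precise than it is.

```python
def honest_range(low: int, high: int, facts_confirmed: bool, step: int = 10_000) -> str:
    """Return a dollar range; widen it to a coarse step when facts are unconfirmed."""
    if low > high:
        raise ValueError("low must not exceed high")
    if not facts_confirmed:
        low = (low // step) * step       # round the lower bound down
        high = -(-high // step) * step   # round the upper bound up
    return f"${low:,} to ${high:,}"
```

For example, `honest_range(41_200, 58_900, facts_confirmed=False)` returns `"$40,000 to $60,000"`. The function never returns a single number.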

Here is the whole point. An AI with no rules is a liability. An AI fenced in by rules earned through this kind of review is a tool you can actually trust.

This Is a Standard Step, Not a Lucky Catch

This was not luck. I now run this before any AI touches a regulated client.

The sequence is always the same. Decide what could go wrong in that specific industry. Assign one AI specialist to each danger. Set them loose on the plan. Turn what they find into hard rules. Then, and only then, build.
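The sequence above can be sketched in a few lines. This is an outline under assumptions, not the actual tooling: `Specialist`, `Finding`, `run_panel`, and `hard_rules` are hypothetical names, and `ask` is a placeholder for whatever function sends a prompt to a model and returns its text reply.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Specialist:
    lens: str    # e.g. "fairness", "legal ethics", "false precision"
    danger: str  # the specific thing this reviewer is told to hunt for


@dataclass
class Finding:
    lens: str
    text: str
    accepted: Optional[bool] = None  # set by a human during triage


def build_prompt(s: Specialist, spec: str) -> str:
    return (
        f"You review this plan only for {s.lens}. Attack it. "
        f"Find the worst way it could cause: {s.danger}. "
        f"List one problem per line.\n\n{spec}"
    )


def run_panel(spec: str, panel, ask: Callable[[str], str]):
    """Send the spec to each specialist and collect one Finding per reply line."""
    findings = []
    for s in panel:
        for line in ask(build_prompt(s, spec)).splitlines():
            line = line.strip("- ").strip()
            if line:
                findings.append(Finding(lens=s.lens, text=line))
    return findings


def hard_rules(findings):
    """Only findings a human explicitly accepted become rules."""
    return [f.text for f in findings if f.accepted is True]
```

Note that `accepted` starts as `None`. Nothing the panel says becomes a rule until a person marks it as real.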

The cost is a few hours and a small amount of spend on model usage. The alternative is finding the landmine after a client is harmed, when fixing it means lawyers, not a quick edit.

I will be honest about the limits. The AI specialists do not catch everything. Some of those 33 findings were noise, and a human still has to decide which ones actually matter. The panel will happily raise a confident objection to a non-problem, because that is what I asked it to do.

So this does not replace judgment. It feeds it. The specialists surface candidates. I decide which are real.

But here is what it reliably does that normal testing does not. It finds the plan-level mistakes that testing never sees. A normal code reviewer checks whether the code does what the plan says, not whether the plan itself is unfair or crosses an ethics line. That problem is invisible at the code level.

If You Are About to Put AI Somewhere It Can Hurt You

If you run a business in law, health, finance, or compliance, you already know one bad AI output can be unrecoverable. That is not a bug ticket. It is an incident.

Avoiding AI entirely is the wrong move. So is trusting a vendor's demo, which is built to show you the happy path and hide exactly the failures this review is designed to expose.

The right answer is to attack the design before it goes live. A few hours of structured pressure turns a clever-looking plan into one you can actually defend when something goes wrong.

That is how I ship every regulated build. The review comes before the build, not after the incident. That ordering is the whole game.
