Credit-Based Pricing for AI SaaS: The Metering Playbook

Why Per-Seat Pricing Falls Apart for AI Products

Traditional SaaS prices per seat because the marginal cost of one more user is basically zero. Adding a login, a row in the users table, a session token. None of that costs you anything meaningful. So you charge $30 a seat, sell a hundred seats, and your gross margin sits north of 80%. The model works because the expensive thing is the software, and you only build that once.

AI products break that assumption completely. The expensive thing isn't the user anymore. It's the model call. The enrichment lookup. The scrape. The vector search. Every time someone clicks a button, you're paying a vendor, and that bill scales with usage, not with headcount.

Across the two or three SaaS products I run, credit-based pricing came out of the same painful realization: per-seat math stops protecting you the moment your costs move from fixed to variable.

The expensive thing isn't the user

When the user logs in, you spend nothing. When the user runs an enrichment-heavy action, that single click can fan out to a half-dozen underlying paid API calls. The login is free. The action is not. Pricing the free thing and ignoring the expensive thing is how you end up with a healthy-looking revenue number and a margin that quietly bleeds out.

The two ways per-seat bankrupts you

Per-seat fails in both directions.

Figure: Per-seat vs credit-based pricing, and how flat seats bankrupt you in both directions. Light users churn and heavy users torch margin under flat seats; under credits the bill scales with actual usage in both cases.

A 5-seat team running 200 AI actions a month is overpaying for what they use. They feel ripped off paying the same as a heavy team, and they churn.

A 5-seat team running 50,000 actions a month on a flat per-seat plan can torch your margin inside a single account. One power user whose action fans out to multiple paid calls can cost you more than they pay you. You don't find out until the vendor invoice lands.
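To put rough numbers on both failure modes, here is a small sketch. It assumes the $30 seat from earlier and borrows the $0.08 worst-case cost per action that this article uses further down; both constants are illustrative, not measured figures.

```python
from decimal import Decimal

SEAT_PRICE = Decimal("30")          # per seat per month, from the per-seat example
COST_PER_ACTION = Decimal("0.08")   # assumed worst-case vendor cost per action

def monthly_margin(seats, actions):
    """Return (revenue, vendor_cost, profit) for a flat per-seat plan."""
    revenue = seats * SEAT_PRICE
    vendor_cost = actions * COST_PER_ACTION
    return revenue, vendor_cost, revenue - vendor_cost

# Light team: pays $150, costs you $16.
light = monthly_margin(seats=5, actions=200)

# Heavy team: pays the same $150, costs you $4,000.
heavy = monthly_margin(seats=5, actions=50_000)
```

Same plan, same price, and the second account loses $3,850 a month under these assumptions.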

What a Credit Actually Maps To

A credit is not an abstract token you invented to feel like a platform. A credit maps to a real, measurable unit of marginal cost. That's the whole point. If you can't tie a credit back to dollars you actually spend, you're guessing, and guessing on unit economics is how AI products die.

One credit equals one real unit of cost

Here's the framing I use. One credit equals one enriched lead. Clean, simple, something the customer values and understands.

Figure: The credit fan-out. One customer-facing credit maps to multiple paid sub-calls: three scrapes, a model call, and an enrichment lookup, about $0.08 worst case, priced at $0.30 to $0.40.

Under the hood, that one credit might be three scrapes, one model summarization call, and one enrichment lookup against a paid data provider. The customer never sees that fan-out. They see one unit, one price. You absorb the complexity and price the bundle.

This is the difference between usage-based pricing for AI that works and usage-based pricing that confuses everyone. You meter the action the customer cares about, not the raw tokens they'll never understand. Nobody wants to reason about "0.0004 credits per 1,000 tokens." They want to know how many leads they can enrich.

Bundling the fan-out behind a single price

The math is straightforward once you accept it's a worst-case exercise, not an average one.

Take the worst-case underlying cost for the action. Say one enriched lead costs you about $0.08 when every sub-call fires and nothing comes back cached. Add your margin. If you want a blended margin that holds even when usage runs hot, you might price that credit at $0.30 to $0.40 and sell credits in bundles.

The key word is worst-case. If you price against your average cost, your heavy users (the ones who run the action that fans out the most) quietly erode your margin. Price against the worst case and the average users become your margin cushion, not your liability. That single decision is what keeps your AI SaaS unit economics from inverting at scale.
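Here is that math as a minimal sketch. The per-call prices in `SUB_CALLS` are hypothetical, picked only so the fan-out sums to the $0.08 worst case above; substitute your own vendor rates.

```python
from decimal import Decimal

# Hypothetical fan-out for one enriched lead: name -> (calls, price per call).
SUB_CALLS = {
    "scrape": (3, Decimal("0.01")),
    "model_summary": (1, Decimal("0.02")),
    "enrichment_lookup": (1, Decimal("0.03")),
}

def worst_case_cost(sub_calls):
    """Cost of one credit when every sub-call fires and nothing is cached."""
    return sum(count * price for count, price in sub_calls.values())

def gross_margin(price, cost):
    """Fraction of the credit price left after paying vendors."""
    return (price - cost) / price

cost = worst_case_cost(SUB_CALLS)                  # 0.08
margin_low = gross_margin(Decimal("0.30"), cost)   # roughly 0.73
margin_high = gross_margin(Decimal("0.40"), cost)  # 0.80
```

Because the input is the worst case, any cached response or skipped sub-call only pushes the real margin above these figures.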

Unlimited Seats, Metered Credits: The Pricing Move That Actually Works

Once the credit carries your margin, the seat stops mattering. So give seats away. Unlimited seats on every paid tier.

This feels wrong the first time you do it. Years of SaaS instinct say seats are revenue. But seats cost you nothing in an AI product, and charging for a thing that costs you nothing is just friction with no upside.

Make the credit the only upgrade gate

When the credit balance is the single upgrade gate, your pricing aligns with the value you deliver. Customers pay for what the AI actually does. Their bill scales with their usage, which is the thing that's actually costing you money. That's the entire per-seat vs. usage pricing argument in one sentence: charge for the expensive thing, not the free thing.

Why unlimited seats removes friction without risk

Per-seat pricing punishes collaboration. A team that wants to bring in three more people has to justify three more line items, so they share a login instead. Now your analytics are garbage and your security team is unhappy.

Unlimited seats does the opposite. Invite the whole team. More users means more usage, more usage means more credits burned, and credits are where your money lives. Free seats are a growth lever, not a cost.

This is the same instinct behind giving the operating system away free and charging the consumer side. You make the thing that's cheap for you abundant and free, and you meter the thing that's expensive. The margin lives where the cost lives.

The Metering Plumbing: Deduct Server-Side on Every Billable Action

This is where good intentions meet reality. You can design the cleanest pricing model in the world and lose all of it to a metering bug or a clever user with browser dev tools.

Why the client can never be trusted to count

Never meter on the client. Ever. If the front end tracks the balance and the front end decides whether to deduct, then anyone who opens dev tools can spoof their balance and run unlimited free actions on your dime. The client is for display. The server is the source of truth.

Figure: Server-side metering transaction with audit ledger and idempotency. Check balance, perform work, and deduct credits run as a single transaction with rollback on failure.

Every billable action follows the same sequence, server-side, inside a single transaction: check balance, perform the work, deduct credits. All or nothing. If any step fails, the whole thing rolls back. You never half-charge a customer, and you never let work happen that you didn't get paid for.

Credit deduction is deterministic code, not an AI decision. The model can decide whether a lead is worth enriching, but the model never touches the balance. That separation matters. I wrote about why you let the model judge and let the code do the math for exactly this reason. Billing logic that depends on an LLM's judgment is a billing system you can't defend.
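The check, work, deduct sequence can be sketched with nothing but Python's built-in `sqlite3`. This is an illustration under assumptions, not a production design: the `accounts` table, `run_billable_action`, and `InsufficientCredits` are names invented for the example, and a real system would use its own database and schema.

```python
import sqlite3

class InsufficientCredits(Exception):
    """Raised when the account cannot cover the action's credit cost."""

def run_billable_action(conn, account_id, cost, work):
    """Check balance, perform the work, deduct credits, all in one transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading the balance
    try:
        row = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        if row is None or row[0] < cost:
            raise InsufficientCredits(account_id)
        result = work()  # paid vendor calls run only after the check passes
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (cost, account_id),
        )
        conn.execute("COMMIT")
        return result
    except Exception:
        conn.execute("ROLLBACK")  # any failure leaves the balance untouched
        raise

# isolation_level=None lets the function control BEGIN/COMMIT explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO accounts VALUES ('acct_1', 2)")
```

Calling `run_billable_action(conn, "acct_1", 1, some_callable)` either returns the work's result with one credit deducted, or raises with the balance unchanged. Note that the rollback only undoes database writes; it cannot un-call a vendor.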

The credit-transactions audit table

Behind every deduction sits an append-only ledger. A credit-transactions table where each row records the action type, credits deducted, balance after, timestamp, and user. You never update a row. You only append.

That ledger does two jobs. When a customer disputes their usage, you have a defensible, timestamped record of exactly what happened, action by action. And because every row is tagged by action type, you get per-feature cost analytics for free. You can see which features burn the most credits and whether they're priced right.
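A sketch of what such a ledger could look like, again using SQLite for illustration. The table and column names are assumptions based on the fields listed above, and the two triggers are one possible way to enforce "append only" at the database level rather than by convention.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE credit_transactions (
    id               INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id          TEXT    NOT NULL,
    action_type      TEXT    NOT NULL,
    credits_deducted INTEGER NOT NULL,
    balance_after    INTEGER NOT NULL,
    created_at       TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Reject any attempt to rewrite history.
CREATE TRIGGER ledger_no_update BEFORE UPDATE ON credit_transactions
BEGIN SELECT RAISE(ABORT, 'credit_transactions is append-only'); END;
CREATE TRIGGER ledger_no_delete BEFORE DELETE ON credit_transactions
BEGIN SELECT RAISE(ABORT, 'credit_transactions is append-only'); END;
""")

def record(conn, user_id, action_type, credits, balance_after):
    """Append one ledger row; rows are never updated or deleted."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO credit_transactions "
            "(user_id, action_type, credits_deducted, balance_after) "
            "VALUES (?, ?, ?, ?)",
            (user_id, action_type, credits, balance_after),
        )

def credits_by_action(conn):
    """Per-feature usage: total credits burned by each action type."""
    return conn.execute(
        "SELECT action_type, SUM(credits_deducted) AS total "
        "FROM credit_transactions GROUP BY action_type ORDER BY total DESC"
    ).fetchall()

record(conn, "user_1", "enrich_lead", 1, 9)
record(conn, "user_1", "enrich_lead", 1, 8)
record(conn, "user_2", "summarize_page", 3, 5)
```

With those three rows, `credits_by_action(conn)` reports `summarize_page` at 3 credits and `enrich_lead` at 2, which is the per-feature view the ledger gives you for free.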

One more thing that bites people: idempotency. Networks retry. If a request fires twice because the first response got lost, you cannot deduct twice. Tag each billable request with an idempotency key so a retry returns the original result instead of charging again. Skip this and your support inbox fills with double-charge complaints.
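One way to sketch the idempotency check: store each key next to the result it produced, and replay that result on a retry. `charge_once` and the `idempotency_keys` table are hypothetical names, and the balance check is left out here to keep the focus on the retry path.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL);
CREATE TABLE idempotency_keys (key TEXT PRIMARY KEY, result TEXT NOT NULL);
INSERT INTO accounts VALUES ('acct_1', 10);
""")

def charge_once(conn, account_id, cost, idempotency_key, work):
    """Run and charge for `work` at most once per idempotency key."""
    with conn:  # one transaction: commits on success, rolls back on error
        seen = conn.execute(
            "SELECT result FROM idempotency_keys WHERE key = ?",
            (idempotency_key,),
        ).fetchone()
        if seen is not None:
            return seen[0]  # retry: replay the stored result, deduct nothing
        result = work()
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (cost, account_id),
        )
        # The primary key also rejects a duplicate key that slips past the SELECT.
        conn.execute(
            "INSERT INTO idempotency_keys (key, result) VALUES (?, ?)",
            (idempotency_key, result),
        )
        return result
```

Calling `charge_once` twice with the same key returns the same result both times and moves the balance once. The client generates the key (a UUID per user action is a common choice) and sends the same one on every retry.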

What Happens When the Balance Hits Zero

The zero-balance case is where a lot of products fail quietly and expensively. Handle it deliberately.

The 402 upgrade prompt

When a customer is out of credits, the server returns an HTTP 402, Payment Required. The HTTP spec still lists 402 as reserved for future use, but it's the closest fit and a common convention for exactly this case. Alongside it, return a structured payload the front end can render as a clean upgrade prompt: current balance, what they tried to do, and a button to buy more.

Figure: Zero-balance handling, a 402 hard stop before any paid call fires. With sufficient balance the action runs and deducts credits; without it the server returns HTTP 402 and blocks the action.

The critical detail: the action is blocked before any paid underlying call fires. You check the balance first. If they can't pay, the expensive operation never runs. You don't spend a cent on a customer who's out of credits. This is the single line of defense that stops one account from blowing your margin in an afternoon.

Don't fail silently. Don't throw a generic 500. A confused user who hits a vague error churns. A user who hits a clean "you're out of credits, here's how to get more" prompt converts.
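A framework-free sketch of that handler shape. The payload field names and the `/billing/credits` URL are assumptions for illustration; the only fixed part is that the balance check comes before the paid call.

```python
from http import HTTPStatus

def handle_enrich(balance, cost, enrich):
    """Return (status, body). `enrich` is the paid operation, passed as a callable."""
    if balance < cost:
        # Blocked before any vendor call fires; nothing is spent on this request.
        return HTTPStatus.PAYMENT_REQUIRED, {
            "error": "insufficient_credits",
            "action": "enrich_lead",
            "balance": balance,
            "required": cost,
            "upgrade_url": "/billing/credits",  # hypothetical route
        }
    return HTTPStatus.OK, {"result": enrich()}
```

The front end can switch on the 402 status and render the body as the upgrade prompt: what the user tried, what they have, and where to buy more.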

Soft limits, hard stops, and grace

The hard stop at zero protects your margin. That's the floor and it's non-negotiable.

But you can be humane above the floor. A low-balance warning at, say, 10% remaining keeps a paying customer from getting surprised mid-workflow. Some products add a small grace buffer so a long-running batch job doesn't die at exactly zero and lose work. Those are design choices, not requirements.
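Those choices can be sketched as a small classifier. The 10% threshold comes from the example above; the `grace` parameter, the state names, and the function itself are hypothetical, and with `grace=0` the behavior is the plain hard stop at zero.

```python
def balance_state(balance, allotment, warn_fraction=0.10, grace=0):
    """Classify a credit balance as 'ok', 'low', 'grace', or 'blocked'.

    `grace` is an optional number of credits a running job may overdraw.
    """
    if balance <= -grace:
        return "blocked"   # hard stop: no further paid calls
    if balance <= 0:
        return "grace"     # overdrawn, but still inside the buffer
    if balance <= allotment * warn_fraction:
        return "low"       # show the low-balance warning
    return "ok"
```

For a 1,000-credit plan, a balance of 50 comes back as `"low"`, and a balance of 0 comes back as `"blocked"` unless a grace buffer is configured.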

The principle underneath all of it stays fixed: never run the expensive operation before you've confirmed the customer can pay for it.

Capacity-Plan the Vendor Tier So the Unit Economics Close

Here's the part most builders skip, and it's the part that actually decides whether your business survives contact with success.

Your underlying AI and enrichment vendors have their own tiers and rate limits. They are not infinite, and they are not free. Your pricing has to account for their pricing, at scale, under the worst case.

Model the full-subscriber-load worst case

Don't model average usage. Model full load. If every single subscriber maxed out their credits this month, does your vendor bill still leave you margin?

Figure: Capacity-planning the vendor tier against full subscriber load. The full-load worst case is compared with average usage, showing how clustered power-user activity can double the vendor bill and invert margin.

Run the number honestly. Take your full subscriber count, multiply by the credits each could burn at their tier, fan that out to the underlying vendor calls, and price it against your vendor's actual rate card including any volume tiers or overage penalties. If that worst-case bill eats your margin, then either your per-credit price is too low or you're on the wrong vendor tier.

Most people never run this. They model average usage, the average looks fine, and then a few power users cluster their usage into the same billing period and the vendor invoice doubles.
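A minimal version of that calculation, with made-up inputs. The tiers, subscriber counts, and plan prices below are hypothetical; the $0.08 is the worst-case cost per credit from earlier. A real model would replace the flat rate with the vendor's actual rate card, including volume tiers and overage charges.

```python
from decimal import Decimal

# Hypothetical tiers: name -> (subscribers, credits per month, plan price in dollars)
TIERS = {
    "starter": (200, 500, Decimal("150")),
    "pro": (50, 5_000, Decimal("1500")),
}
WORST_CASE_COST_PER_CREDIT = Decimal("0.08")

def load_model(tiers, cost_per_credit, utilization=Decimal("1")):
    """Return (revenue, vendor_bill, margin) at the given share of credits burned."""
    revenue = sum(subs * price for subs, _, price in tiers.values())
    credits = sum(subs * allotment for subs, allotment, _ in tiers.values())
    vendor_bill = credits * utilization * cost_per_credit
    return revenue, vendor_bill, (revenue - vendor_bill) / revenue

# Full load: every subscriber burns every credit.
full = load_model(TIERS, WORST_CASE_COST_PER_CREDIT)

# The comfortable-looking case: only 40% of credits get used.
average = load_model(TIERS, WORST_CASE_COST_PER_CREDIT, utilization=Decimal("0.4"))
```

With these inputs the full-load vendor bill is $28,000 against $105,000 of revenue, versus $11,200 at 40% utilization. If the full-load margin is the one that breaks, that is the number to fix before launch.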

When the vendor math forces a build decision

At volume, the vendor bill is exactly where the build vs. buy decision for AI features shows up. If a third-party enrichment provider is charging you per lookup and you're doing millions of lookups, there comes a point where owning that capability in-house changes your cost structure entirely. The vendor math forces the question.

And to meter AI usage properly, you have to treat pricing as a living number, not a spreadsheet you fill out once at launch. Vendor pricing shifts. Usage patterns shift. New models drop at a tenth the cost. Re-check the math every quarter, because the assumptions you priced on six months ago may not hold today.

Building It So You Think in Margin, Not Just Features

If you're a CEO evaluating an AI build, the thing that keeps you up isn't the demo. The demo always looks great. What keeps you up is the question nobody in the sales meeting answers: what happens to my costs when this actually gets used, and how do I price it so I don't lose money on my best customers?

The answer is that a builder worth hiring designs the metering and unit economics into the product from day one, not as a panic project after the first scary vendor invoice.

The five moves, recapped:

A credit maps to one real unit of cost, priced against the worst case
Unlimited seats, because seats are free and friction is not
Server-side deduction inside one transaction, backed by an append-only audit ledger, with idempotency
A 402 hard stop at zero so no expensive call fires for a customer who can't pay
A vendor tier capacity-planned against full subscriber load, not average

Honest note: this adds real engineering work up front. The metering layer, the ledger, the idempotency handling, the capacity model. And you have to revisit the math as usage scales, because it will drift. There's no version of this you build once and forget.

But that work is the difference between an AI product with a margin and an AI product with a leak.

If you're building an AI product or feature and you want pricing and metering that protects your margin instead of leaking it, that's the kind of thing I build. Have me build it with margin baked in.