The Domain-Knowledge Software Bug That Hid for Months (Simply Explained)

The Order That Didn't Add Up

The owner of a window-blinds factory was looking over a single order when something felt off. The count of one part was short. Just one piece missing on one product. Most people would shrug and move on.

He didn't. And that one missing part is how we found one of the most expensive software bugs I've ever traced.

Here's what makes this story worth telling. There was no crash. No error message. No red warning light anywhere. The software ran perfectly. It produced clean, confident, completely wrong numbers, and it had been doing it for months.

Every order for a certain kind of product had been quietly shorting parts. And nobody had noticed.

How the Software Got Fooled

To understand the bug, you have to understand the product. Not the code. The product.

In window blinds, there's a thing called a "dual shade." It's two separate shades built into one unit. Think of it like a sandwich: one layer blocks all the light, the other softens it, both built into the same frame. The customer sees one window covering. The factory has to build two complete shades.

That difference is everything.

The quoting paperwork listed each dual shade as a single line item. One line on the spreadsheet. To the software reading that spreadsheet, one line meant one shade. But in the real world, that one line meant two shades.

The software had no reason to doubt it. One line, one shade. That's a perfectly reasonable assumption for anyone who doesn't build window blinds for a living.

But every dual shade needs two of everything. Two sets of rails, tubes, brackets, controls, motors. So when the software saw one line and counted parts for one shade, it ordered exactly half the parts the factory actually needed.

The math was correct. The software did exactly what it was told. The problem was that what it had been told didn't match how the product works in real life.
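To make the flawed premise concrete, here is a minimal Python sketch. It is not the factory's actual code. The names `LineItem`, `PARTS_PER_SHADE`, and `naive_parts_count`, and the per-shade quantities, are made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical hardware needed for ONE physical shade.
# The real bill of materials is not given in this article.
PARTS_PER_SHADE = {"rail": 1, "tube": 1, "bracket": 2, "control": 1}


@dataclass
class LineItem:
    product: str
    kind: str  # "single" or "dual"


def naive_parts_count(lines):
    """Count parts under the flawed premise: one line item is one shade."""
    totals = {}
    for _line in lines:  # the line's kind is never consulted
        for part, qty in PARTS_PER_SHADE.items():
            totals[part] = totals.get(part, 0) + qty
    return totals


order = [LineItem("Living room", "dual")]
print(naive_parts_count(order))
# {'rail': 1, 'tube': 1, 'bracket': 2, 'control': 1}
```

Nothing here is wrong as code. A dual shade needs two of each, so under these assumed quantities the right answer would be double every number printed.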

The Second Bug Hiding Underneath

Once we started pulling the thread, a second problem showed up. This one was arguably worse.

The second shade's fabric was vanishing completely.

The paperwork actually had two fabric entries, one for each shade in the dual unit. But when the software read the file and saw two entries with the same label, it did what a lot of software does with duplicate labels. It kept the first one and quietly threw away the second. No warning. No error. The second shade's fabric was simply dropped.
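Here is a small sketch of how that kind of loss can happen, assuming the file is read into a label-to-value lookup. The function name, the "Fabric" label, and the fabric names are hypothetical.

```python
def read_fields_keep_first(rows):
    """Load (label, value) rows into a dict, keeping the first value per label."""
    fields = {}
    for label, value in rows:
        # setdefault stores the value only if the label is new,
        # so a repeated label is ignored without any warning.
        fields.setdefault(label, value)
    return fields


rows = [
    ("Product", "Dual Shade"),
    ("Fabric", "Blackout White"),  # fabric for the first shade
    ("Fabric", "Sheer Linen"),     # fabric for the second shade
]
print(read_fields_keep_first(rows))
# {'Product': 'Dual Shade', 'Fabric': 'Blackout White'}
```

A plain `fields[label] = value` would keep the last entry instead of the first. Either way, one fabric disappears and the result still looks complete.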

So now you had two silent errors stacked on top of each other. Half the hardware. And a missing fabric.

Both invisible. Because the order still looked complete.

This is the part I want every business owner to sit with. A blank field would have been caught. Someone reviewing the order would have seen an empty box and asked what happened. But the order wasn't blank. It had a fabric. It had a parts count. Everything was filled in. Everything looked done.

Plausible-looking numbers are what slip through. Software that looks like it works is far more dangerous than software that obviously doesn't. The broken kind gets fixed in an afternoon. The "looks fine" kind quietly ships wrong orders for months.

Why Nobody Caught It

This bug had no symptoms. None.

It didn't crash. It didn't show an error. It produced a real number that happened to be exactly half of what it should have been. From the software's point of view, nothing went wrong.

A standard error alarm can't fire on a problem that produces normal-looking results. The error rate stays at zero. Everything stays green. The system reports perfect health while quietly shipping wrong orders.

And testing wouldn't catch it either. Most orders were the simple single-shade kind, and those were perfectly correct. One line, one shade. The assumption held. So every check on a normal order passed.

Only the dual orders were wrong, and they were wrong by an amount that looked believable. Half the parts on a dual shade still looks like a reasonable number if you don't know the order is supposed to have two shades' worth.

Standard testing only checks that the software does what it was designed to do. This software did exactly what it was designed to do. The flaw was in the design's belief about the product, and no amount of testing the belief against itself would expose it.
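A short sketch of why the tests stay green. The function and test names are hypothetical, and two brackets per shade is an assumed number, not the factory's real one.

```python
def parts_for_order(line_kinds, brackets_per_shade=2):
    """Flawed logic under test: every line item is treated as one shade."""
    return len(line_kinds) * brackets_per_shade


def test_single_shade_order():
    # Passes: for single shades the premise happens to be true.
    assert parts_for_order(["single"]) == 2


def test_dual_shade_written_from_the_code():
    # Also passes: the expected value came from the same wrong premise.
    assert parts_for_order(["dual"]) == 2


def dual_shade_expectation_from_the_factory():
    # What someone who builds the product expects: two shades' worth.
    return 2 * 2


test_single_shade_order()
test_dual_shade_written_from_the_code()
print(parts_for_order(["dual"]), "vs", dual_shade_expectation_from_the_factory())
# 2 vs 4
```

Both tests pass. The only check that would fail is the one whose expected value comes from the factory floor rather than from the code.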

The Fix Was Small. Finding It Was the Hard Part.

The fix itself took an afternoon. That's almost always true with these bugs.

I added a simple rule: a dual shade counts as two physical shades, a normal one counts as one. Then I applied that rule everywhere the software added up parts. Not just the one part that tipped us off. Every count. Now every part traces back to a real, physical shade.
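In code, the rule can be as small as this. It is a sketch under assumptions, not the production fix: `physical_shades`, `parts_count`, and the per-shade quantities are illustrative names and numbers.

```python
# Hypothetical hardware needed for ONE physical shade.
PARTS_PER_SHADE = {"rail": 1, "tube": 1, "bracket": 2}


def physical_shades(kind):
    """The rule: a dual shade is two physical shades, a normal one is one."""
    if kind == "dual":
        return 2
    if kind == "single":
        return 1
    raise ValueError(f"unknown shade kind: {kind!r}")


def parts_count(line_kinds):
    """Add up parts per physical shade, not per line item."""
    totals = {part: 0 for part in PARTS_PER_SHADE}
    for kind in line_kinds:
        shades = physical_shades(kind)
        for part, qty in PARTS_PER_SHADE.items():
            totals[part] += qty * shades
    return totals


print(parts_count(["single", "dual"]))
# {'rail': 3, 'tube': 3, 'bracket': 6}
```

Every count goes through `physical_shades`, so there is one place where the rule lives. Raising on an unknown kind is my own choice in this sketch: a new product type should stop the count instead of being guessed at.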

The second fix was rescuing the lost fabric. Instead of letting the software throw away the duplicate, I told it to read both entries and assign the second fabric to the second shade. The fabric that had been disappearing for months was finally captured.
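A sketch of the same idea in Python, assuming the file arrives as (label, value) rows and the fabrics appear in shade order. The names, the "Fabric" label, and the error on a mismatched count are my assumptions for illustration.

```python
def read_fabrics(rows, label="Fabric"):
    """Collect every fabric entry in file order instead of keeping only one."""
    return [value for key, value in rows if key == label]


def assign_fabrics(kind, rows):
    """Give each physical shade its own fabric, first entry to first shade."""
    fabrics = read_fabrics(rows)
    expected = 2 if kind == "dual" else 1
    if len(fabrics) != expected:
        # Fail loudly rather than ship an order that only looks complete.
        raise ValueError(
            f"{kind} shade expects {expected} fabric entries, found {len(fabrics)}"
        )
    return {f"shade_{i}": fabric for i, fabric in enumerate(fabrics, start=1)}


rows = [("Fabric", "Blackout White"), ("Fabric", "Sheer Linen")]
print(assign_fabrics("dual", rows))
# {'shade_1': 'Blackout White', 'shade_2': 'Sheer Linen'}
```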

Then the part that matters most long term. I didn't just patch it. I wrote the rule down inside the system, in plain terms, so the next person (or the next AI) building a new feature can't accidentally forget it and bring the bug back.

Business rules drift over time. Someone adds a feature and the unwritten assumption is gone. So I built the rule into the foundation instead of hoping everyone remembers it.
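One way to build a rule into the foundation is to put it in a single named place and check every order against it. This is a hedged sketch of that idea, not the system's real code: `ShadeKind`, `check_order`, and the two-brackets-per-shade figure are assumptions.

```python
from enum import Enum


class ShadeKind(Enum):
    """Business rule, written down where new features will find it.

    The value is the number of PHYSICAL shades the factory builds
    for one line item of this kind. A dual shade is two shades.
    """

    SINGLE = 1
    DUAL = 2

    @property
    def physical_shades(self):
        return self.value


def check_order(kind, fabrics, bracket_count, brackets_per_shade=2):
    """Return a list of rule violations; an empty list means the order is consistent."""
    problems = []
    shades = kind.physical_shades
    if len(fabrics) != shades:
        problems.append(f"expected {shades} fabrics, found {len(fabrics)}")
    expected_brackets = shades * brackets_per_shade
    if bracket_count != expected_brackets:
        problems.append(f"expected {expected_brackets} brackets, found {bracket_count}")
    return problems


print(check_order(ShadeKind.DUAL, ["Blackout White"], 2))
# ['expected 2 fabrics, found 1', 'expected 4 brackets, found 2']
```

An order produced by the old logic now fails both checks instead of passing quietly.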

Who Catches the Bug When AI Builds the Software

Here's the question every owner should be asking right now. If AI writes my software, who catches the subtle bugs only someone who knows my business would notice?

I'll give you the honest answer, because the honest answer is the whole point.

AI would not have caught this on its own. Neither would a typical developer who didn't understand window blinds. The AI built the software correctly against a reasonable assumption: one line, one shade. Any good developer would have written it the same way.

The assumption was wrong because nobody had taught the software what a dual shade really is. The code wasn't broken. The premise was. And that premise came from knowing the product, not from clean coding.

The catch came from a human who builds these things and knew a part count was off. Then it took someone who could connect that one missing part to the deeper data problem and fix it everywhere.

That's two different kinds of knowing. Knowing the business, and knowing the system. The bug only gets caught where those two meet.

AI helps me build faster than I ever could alone. But speed doesn't help if the foundation is built on a wrong belief about your business. You still need someone who treats your business as the source of truth and checks the software against reality, not against its own assumptions.

That's why I build the systems myself instead of just advising from the sidelines. You can't catch a bug like this from a slide deck. You catch it by being close to the product and the data at the same time.

Every automated system you run makes assumptions about your business that nobody wrote down. Most are right. The ones that are wrong don't fail loudly. They produce confident, plausible, wrong output for months until someone who knows the business looks twice and counts the parts.

This was a window-blinds factory. But the pattern is everywhere. A pricing system that assumes every product behaves the same. An inventory count that tracks the wrong unit. Every one of them can run silently wrong while every dashboard says fine.
