False green: the scariest bug in a code scanner

A scanner that crashes is annoying. A scanner that reports '0 problems' when it actually did nothing is dangerous — it manufactures false confidence. Three times this year, ik silently skipped a check and called it a pass. Here's how each one happened, and the rule we now scan ourselves by.

A code scanner has one job: tell you the truth about your repository. The worst way it can fail isn’t crashing. It’s saying “0 problems found” when it actually ran nothing at all.

A crash you notice. You re-run it, you file a bug. But a green badge that should be red is invisible — it doesn’t ask for attention, it removes it. It tells you the thing you most wanted to hear, and you move on. False green doesn’t just fail to find problems; it manufactures confidence that there aren’t any.

Three times this year, ik did exactly that. Each one is a slightly different flavor of the same trap, and together they rewrote how we test.

1. The secret scanner that found nothing on a Mac

We were re-running case studies and noticed two repositories came back with 0 secrets. Running gitleaks directly on the same repos found six to twelve leaks each. The scanner wasn’t scanning and coming up clean; it was producing no results at all.

Root cause: our secrets check passed gitleaks --report-path /dev/stdout and read the pipe. On Linux, /dev/stdout is file descriptor 1 and that works fine. On macOS, open("/dev/stdout", O_WRONLY) returns EACCES — gitleaks logs a fatal error and exits, our subprocess wrapper sees empty stdout, the parser returns nil, and an empty result list is, by default, a pass. Green badge.

The detail that makes it worse: the bug behaved differently across gitleaks versions. The managed 8.21.2 opens the report file only after scanning, so it found the secrets, then failed at write time and reported zero. The newer 8.30.1 pre-flights the path and bails in 50 ms. Same root cause, two timing signatures. And every macOS-local scan since the secrets check shipped had been reporting zero, no matter what was in the repo.

The fix is one line of plumbing (write to a temp file, read it back). The lesson is the regression test: a fake gitleaks shell script that writes to whatever --report-path it’s handed, so we can never quietly go back to piping through a path that doesn’t exist everywhere.

2. PMD 7 deleted a renderer, and four Java checks went quiet

We pre-baked PMD 7 into our scanner image and shipped it. The next morning, every Java scan in production was under-counting findings — and four checks (magic-numbers, complexity, dead-code, error-handling) were the culprits.

PMD 7 removed the checkstyle output renderer. Every PMD invocation in our codebase asked for --format checkstyle, which now errors with “Can’t find the custom format checkstyle,” exits, and leaves empty stdout. Our parser saw nothing and reported nothing. The exact same failure mode as the gitleaks bug: an upstream tool fails loudly, our pipeline swallows it quietly.

It got through review because two of the four checks had no tests at all. Before the deploy the bug was also latent: PMD wasn’t on the old image, so that code path was never exercised in CI. The moment we baked PMD in, the latent bug became a live silent regression on every Java scan.

The fix switched all four to PMD’s stable native XML renderer. But the part that matters is what we added alongside it: smoke tests on the two checks that had none. A check with no test is a check that can go silent without anyone noticing.
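
To make the "empty stdout is not zero findings" idea concrete, here is a minimal Go sketch of a parser for PMD's XML report. The types Violation and pmdReport and the function parsePMDReport are hypothetical names, not ik's code. The element and attribute names (pmd, file, violation, beginline, rule) are assumed from PMD's XML report format and should be checked against real output.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"errors"
	"fmt"
	"strings"
)

// Violation is one finding lifted out of the report.
type Violation struct {
	File    string
	Rule    string
	Line    int
	Message string
}

// pmdReport mirrors the assumed shape of the XML report.
// The XMLName tag makes Unmarshal reject any other root element.
type pmdReport struct {
	XMLName xml.Name `xml:"pmd"`
	Files   []struct {
		Name       string `xml:"name,attr"`
		Violations []struct {
			Rule      string `xml:"rule,attr"`
			BeginLine int    `xml:"beginline,attr"`
			Message   string `xml:",chardata"`
		} `xml:"violation"`
	} `xml:"file"`
}

// parsePMDReport turns stdout into violations. Empty or malformed output
// is an error, so a tool that exited early cannot look like a clean scan.
func parsePMDReport(stdout []byte) ([]Violation, error) {
	if len(bytes.TrimSpace(stdout)) == 0 {
		return nil, errors.New("pmd produced no report: failed run, not zero findings")
	}
	var rep pmdReport
	if err := xml.Unmarshal(stdout, &rep); err != nil {
		return nil, fmt.Errorf("parse pmd report: %w", err)
	}
	var out []Violation
	for _, f := range rep.Files {
		for _, v := range f.Violations {
			out = append(out, Violation{
				File:    f.Name,
				Rule:    v.Rule,
				Line:    v.BeginLine,
				Message: strings.TrimSpace(v.Message),
			})
		}
	}
	return out, nil
}
```

A smoke test for each check can then be as small as one canned report with a known violation plus one empty input that must produce an error.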

3. 790 of 791 Rust repos reported zero dependencies

This one we caught in our own dataset. Across an admin scan batch, 790 of 791 Rust repositories logged zero dependencies from cargo-audit — and, crucially, there were no entries in the error log. Not “the tool crashed.” The tool was never invoked.

Two stacked causes. First, cargo was never installed in our scanner image, so LookPath("cargo") failed and the code routed to a “missing tool” branch — silently, because nothing ran. Second, even with cargo present, the check returned early when Cargo.lock was absent, which is common for library crates. Two different ways to do nothing, both invisible.

The fix added an osv-scanner fallback (already baked in for our Java path) and — the load-bearing change — made every skip record itself in the error log. The invisible-99% case became a visible, queryable row. You can’t fix a silent skip you can’t see.

The rule we scan ourselves by

Three incidents, one shape: an external tool fails, our pipeline interprets “no output” as “no findings,” and a skipped check wears a passing badge.

So we changed the default. A check that couldn’t run is no longer the same as a check that ran and found nothing. Skips get surfaced — to the user as a line in the output, to us as a row we can query across the whole dataset. And the deeper answer is to stop shelling out at all where we can: it’s the same reason we moved our complexity analysis in-process and run a local model in the binary instead of calling an API. A check compiled into the scanner can’t be missing from the box.
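
One way to encode that default, sketched in Go with hypothetical names (Status, CheckResult, Verdict, overall): make "skipped" the zero value of the status type, so a result nobody filled in can never read as a pass, and let a single skip keep the overall badge from going green.

```go
package main

// Status is a tri-state outcome. The zero value is Skipped on purpose.
type Status int

const (
	Skipped Status = iota // the check could not run
	Passed                // the check ran and found nothing
	Failed                // the check ran and found problems
)

// CheckResult is what each check reports back.
type CheckResult struct {
	Name   string
	Status Status
	Reason string // why the check was skipped, shown to the user
}

// Verdict is the badge for the whole scan.
type Verdict string

const (
	Green      Verdict = "green"
	Red        Verdict = "red"
	Incomplete Verdict = "incomplete"
)

// overall is green only when every check ran and passed.
// No results at all counts as incomplete, not as a pass.
func overall(results []CheckResult) Verdict {
	if len(results) == 0 {
		return Incomplete
	}
	verdict := Green
	for _, r := range results {
		switch r.Status {
		case Failed:
			return Red
		case Skipped:
			verdict = Incomplete
		}
	}
	return verdict
}
```

The design choice that matters is the ordering of the constants: an empty struct, a nil slice, or a forgotten assignment all land on "incomplete" rather than "green".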

A scanner you can trust on a machine you didn’t set up is mostly a scanner that refuses to fake a green. That’s the bar.

Want a scanner that tells you when it didn’t run something? Install the CLI — one line, about a minute, no account required.

