May 12, 2026 · inkode team

What 5,299 scans taught us about AI-written code

Three weeks ago, we shared results from 1,961 repository scans. Now we’ve scanned 5,299.

The bigger dataset did not weaken the signal. It made it clearer.

AI-marked repositories still commit secrets much more often than non-AI repos. After adjusting for project size, they still score lower on average. And one surprising result stayed almost exactly the same: simply detecting AI usage is one of the strongest indicators of a lower code quality score.

Some earlier findings held up. Some changed once the sample got larger. A few things turned out to be wrong.

This post covers all of that.


The short version

From 5,299 repositories:

  • 19.9% contained at least one committed secret
  • AI-marked repos committed secrets 3.2× more often
  • AI-marked repos scored 8–11 points lower than non-AI repos of the same size
  • The ai-stack signal became the second-strongest predictor of a low score

At the same time, a few earlier conclusions no longer hold. Some language-specific trends flattened out, and one repo-age claim turned out to be based on bad data.


How the dataset works

inkode is a command-line tool that scans Git repositories.

It runs 20 separate checks and produces:

  • a score from 0–100
  • an A–F grade
  • detailed findings

The dataset includes 5,299 unique repositories.

We split them into two groups:

Group       Repositories
AI-marked   1,007
Non-AI      4,282

A repository was considered “AI-marked” if it contained signs of AI assistant usage, including:

  • .cursor/
  • .claude/
  • CLAUDE.md
  • AI SDK dependencies
  • commit trailers like Co-Authored-By: Copilot
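
A minimal sketch of the file-based part of that definition, assuming the markers sit at the top level of the repository. The marker tuple and function names are hypothetical, and this is not inkode's actual ai-stack check; dependency and commit-trailer detection are left out.

```python
from pathlib import Path

# Hypothetical marker list, taken from the examples above. The real
# ai-stack check may look at more (or different) signals.
AI_MARKER_PATHS = (".cursor", ".claude", "CLAUDE.md")


def find_ai_markers(repo_root: str) -> list[str]:
    """Return the marker paths that exist at the top level of repo_root."""
    root = Path(repo_root)
    return [name for name in AI_MARKER_PATHS if (root / name).exists()]


def is_ai_marked(repo_root: str) -> bool:
    """A repo counts as AI-marked here if any marker path is present."""
    return bool(find_ai_markers(repo_root))
```

Calling is_ai_marked(".") from a checkout gives a quick yes or no for the current repository.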

One new detail stood out this time:

84.5% of AI-marked repos contained at least one explicit AI commit trailer.

That matters because commit trailers are a much cleaner signal than config folders or dependencies. Someone — or some tool — intentionally added them.

The dataset also shifted over time. The earlier sample leaned heavily toward Rust projects. The larger sample is now mostly Go repositories.

That changed some of the language-level trends, but not the core AI vs non-AI comparisons.


Claim 1 — Almost 1 in 5 repos contain committed secrets

Across all scanned repositories:

19.9% had at least one committed secret

That includes things like:

  • API keys
  • passwords
  • access tokens
  • credentials inside config files

Example findings looked like this:

.env.example:5    AWS access key
src/config.js:12  hardcoded API key
Dockerfile:8      DB password in ENV line
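
To make the shape of those findings concrete, here is a rough line-by-line scanner. It is an illustration only, not inkode's detection logic: the AWS access key ID shape (AKIA followed by 16 uppercase letters or digits) is widely documented, while the other two patterns are loose, hypothetical heuristics that will produce false positives.

```python
import re

# Illustrative patterns only; labels mirror the example findings above.
PATTERNS = {
    "AWS access key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "hardcoded API key": re.compile(
        r"(?i)api[_-]?key\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
    "DB password in ENV line": re.compile(
        r"(?i)^\s*ENV\s+\S*PASSWORD\S*[ =]\S+"
    ),
}


def scan_text(path: str, text: str) -> list[str]:
    """Return 'path:line  label' strings for every pattern hit in text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append(f"{path}:{lineno}  {label}")
    return findings
```

A real scanner would also need entropy checks, allow-lists, and history scanning, which this sketch leaves out.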

This number is lower than the 23.1% we reported in the smaller sample.

The earlier dataset had more small internal or admin-style projects, where this problem appeared more often.

But the AI vs non-AI gap moved in the opposite direction.


Claim 2 — AI-marked repos commit secrets 3.2× more often

This is still the clearest result in the dataset.

Repository type   Secret rate
AI-marked         44.5%
Non-AI            14.1%

That is a 3.2× difference.
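
The arithmetic behind that ratio, and how the two group rates combine into the overall 19.9%, can be checked directly. The sketch below uses only the group sizes and rates published in this post, and assumes the overall rate is the size-weighted average of the two groups.

```python
# Figures taken from the tables in this post.
ai_repos, non_ai_repos = 1_007, 4_282
ai_rate, non_ai_rate = 0.445, 0.141

# Ratio of the two secret rates.
ratio = ai_rate / non_ai_rate

# Overall rate as the size-weighted average of the two groups.
overall = (ai_repos * ai_rate + non_ai_repos * non_ai_rate) / (
    ai_repos + non_ai_repos
)

print(f"ratio: {ratio:.1f}x")     # ratio: 3.2x
print(f"overall: {overall:.1%}")  # overall: 19.9%
```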

In the earlier dataset, the gap was 2.7×. Instead of shrinking as the sample grew, it became larger.

That surprised us.


Why project size does not fully explain this

This metric is binary:

Does the repository contain at least one committed secret?

Larger repositories do have more files, but that alone does not automatically create a “yes”.

Many large repositories had zero secrets. Many tiny AI-marked repos had several.

The pattern appears across all repository sizes.


Why this probably happens

The explanation is usually simple.

AI assistants do not know which strings are sensitive unless the user explicitly tells them.

If a prompt contains:

Use Stripe API key sk_live_abc123

the assistant may paste that value directly into source code.

An experienced engineer would normally:

  • move the key into .env
  • add .env to .gitignore
  • commit a safe .env.example

AI tools can do that too — but only if the prompt asks for it.
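
As a sketch of the safer pattern, the code below reads the key from the environment instead of embedding it. STRIPE_API_KEY is a hypothetical variable name chosen for this example, and the sketch assumes the variable is exported by the shell or by a dotenv loader, since Python does not read .env files on its own.

```python
import os


def load_stripe_key() -> str:
    """Read the key from the environment instead of hardcoding it."""
    # STRIPE_API_KEY is a hypothetical variable name for this example.
    key = os.environ.get("STRIPE_API_KEY")
    if not key:
        raise RuntimeError(
            "STRIPE_API_KEY is not set; add it to your local .env"
        )
    return key
```

The committed .env.example would then contain only a placeholder line such as STRIPE_API_KEY= with no value, while the real .env stays listed in .gitignore.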


Claim 3 — After adjusting for repo size, AI repos still score lower

At first glance, the difference looks huge.

Average scores:

Group       Average score
Non-AI      83.4
AI-marked   57.7

That is a 25.7 point gap.

But that number alone is misleading.

Larger repositories almost always score lower because bigger codebases accumulate more findings over time. AI-marked repos also tend to be larger.

So we compared repositories inside the same size ranges.

Repo size     Non-AI avg   AI avg   Gap
<100 files    92.4         81.4     -11.0
100–500       69.0         61.0     -8.0
500–2,000     57.1         48.2     -8.9
2,000+        50.0         42.1     -7.9

After controlling for size, AI-marked repos still scored:

8–11 points lower in every bucket

That result became slightly stronger in the larger dataset.
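
The size-controlled comparison can be sketched as a simple stratified mean. The input format and function names below are hypothetical, and which bucket a boundary value such as exactly 500 files falls into is an assumption, since the post does not say.

```python
from statistics import mean


def size_bucket(file_count: int) -> str:
    """Map a file count to one of the four size ranges used above."""
    # Boundary handling (e.g. exactly 500 files) is an assumption.
    if file_count < 100:
        return "<100 files"
    if file_count < 500:
        return "100-500"
    if file_count < 2000:
        return "500-2,000"
    return "2,000+"


def gaps_by_bucket(repos):
    """repos: iterable of (file_count, ai_marked, score) tuples.

    Returns {bucket: AI average minus non-AI average} for every bucket
    that contains at least one repo from each group.
    """
    groups = {}
    for file_count, ai_marked, score in repos:
        key = (size_bucket(file_count), bool(ai_marked))
        groups.setdefault(key, []).append(score)

    gaps = {}
    for bucket, _ in groups:
        if (bucket, True) in groups and (bucket, False) in groups:
            gaps[bucket] = mean(groups[(bucket, True)]) - mean(
                groups[(bucket, False)]
            )
    return gaps
```

Comparing within buckets removes most of the size effect, which is why the per-bucket gaps are so much smaller than the raw difference in averages.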

The important part is this:

  • the effect is real
  • it appears consistently
  • but it is much smaller than the raw 25-point headline number

Claim 4 — Detecting AI usage strongly predicts lower scores

One of the most surprising findings stayed almost unchanged.

The ai-stack check does not affect the final score.

It only detects signs of AI assistant usage.

Even so, it is the second-strongest predictor of a low score:

Check          Correlation with score
hotspot        -0.638
ai-stack       -0.477
coupling       -0.335
line-count     -0.308
import-graph   -0.307

In simple terms:

Whether a repo shows AI markers tracks its score more closely than any check except hotspot, which looks at its busiest files.

Some of this comes from repo size. AI repos are often larger. But the signal still remains after correcting for that.
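
For readers who want to reproduce this kind of number on their own scans, here is a minimal sketch. It assumes a plain Pearson coefficient between a 0/1 flag and the score (equivalent to the point-biserial correlation); the post does not state which coefficient was used, and the sample data is invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy data, invented for illustration: 1 = ai-stack fired, 0 = it did not.
ai_flag = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [55, 62, 48, 90, 85, 70, 88, 58]

r = pearson(ai_flag, scores)
print(f"r = {r:.3f}")  # negative: flagged repos score lower in this toy sample
```

A negative value means repos where the flag fired tend to score lower. It says nothing on its own about why.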


Claim 5 — Most AI-marked repos contain explicit AI commit trailers

This run introduced a cleaner way to measure AI-assisted coding.

Previously, we counted many different markers:

  • config folders
  • AI SDKs
  • rules files
  • commit trailers

But not all of those show that an assistant actually wrote code in the repository.

A dependency on an AI SDK only means the project calls an LLM somewhere.

Commit trailers are different.

Example:

Co-Authored-By: Claude

That is an explicit signal added by a user or tool.

In this dataset:

84.5% of AI-marked repos had at least one AI commit trailer

The average AI-marked repository had trailers on 6.3% of commits.
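
A per-repository trailer rate like that can be sketched as below. The list of assistant names is a hypothetical starting point rather than the set inkode matches, and collecting the commit messages (for example from git log) is left out.

```python
import re

# Hypothetical list of assistant names; extend it to match your tooling.
AI_TRAILER = re.compile(
    r"^Co-Authored-By:.*\b(Claude|Copilot|Cursor)\b",
    re.IGNORECASE | re.MULTILINE,
)


def has_ai_trailer(message: str) -> bool:
    """True if any line of the commit message is an AI co-author trailer."""
    return AI_TRAILER.search(message) is not None


def ai_trailer_rate(messages: list[str]) -> float:
    """Share of commit messages that carry at least one AI trailer."""
    if not messages:
        return 0.0
    return sum(has_ai_trailer(m) for m in messages) / len(messages)
```

Anchoring the pattern to the start of a line keeps a commit that merely mentions an assistant in its subject or body from being counted.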

Most teams are not using AI for every commit. They are using it selectively.

Going forward, we’ll likely treat commit trailers as the primary definition of “AI-assisted coding”.


What stayed true

Three earlier findings became stronger in the larger dataset:

1. The secrets gap widened

  • Previously: 2.7×
  • Now: 3.2×

2. ai-stack stayed a top predictor

  • Previously: -0.438
  • Now: -0.477

3. The size-controlled AI penalty remained

  • Previously: 7–8 points
  • Now: 8–11 points

What changed

Some earlier conclusions did not survive the larger sample.

The language story became weaker

We previously said JavaScript repos showed AI markers far more often than Rust repos.

That gap shrank significantly once the sample grew.

We would not repeat that claim today.

The cleaner language-level signal now comes from commit trailers:

  • C++ and C showed the highest AI trailer rates
  • Go and Rust showed the lowest

The “D or F” rate dropped

Previously:

  • 38.4% of first-time scans scored D or F

Now:

  • 28.6%

Why?

The newer dataset contains many small, clean Go libraries that pulled scores upward.


Repo age does matter a bit

We originally reported almost no relationship between repository age and score.

That turned out to be wrong.

A bug had zeroed out repo age values in much of the earlier dataset.

With corrected data:

  • older repos score slightly lower on average
  • mostly because older repos also tend to be larger

The effect becomes much weaker after adjusting for size.


What we still cannot claim

There are still things we are deliberately avoiding over-claiming.

Per-tool rankings

We can already separate repos by tools like:

  • Cursor
  • Claude
  • GitHub Copilot
  • Claude Code

Some early differences are visible.

But the sample sizes are still too small to publish reliable rankings.

For example:

  • per-tool secret rates currently range from 48% to 60%
  • F-grade rates range from 64% to 80%

That spread is interesting, but not stable enough yet.


CLI usage patterns

The dataset also includes 758 CLI scans.

But most of them appear to come from:

  • local testing
  • repeated installs
  • broken anonymous runs

Until that data is cleaned up, we are excluding it from public-facing claims.


Final takeaway

We scanned 5,299 repositories over 32 days.

The main findings stayed surprisingly consistent:

  • AI-marked repos commit secrets more often
  • they score lower even after adjusting for size
  • and AI usage strongly predicts lower repository scores

But the larger dataset also made one thing clearer:

This is probably less about the AI tools themselves, and more about workflow habits.

AI assistants generate code very quickly. That changes the speed of development.

What has not changed yet is the review process around that code.

The teams getting the best results from AI are usually the ones still applying the same habits they used before:

  • reviewing generated code carefully
  • separating secrets from source
  • checking architecture decisions
  • cleaning up before commit

The assistant can write code fast. The responsibility to review it still belongs to the developer.
