BenchProctor blog

Engineering notes

How we build a SAST benchmark you can't game — methodology, scoring, coverage, and the occasional war story from generating millions of labeled test cases.

· scoring, methodology

How BenchProctor scores a SAST tool

The whole scoring model is a confusion matrix and one subtraction. Here's how true-positive and false-positive rates become a single number, why we average per category, and how the benchmark checks itself.
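As a rough illustration of what "a confusion matrix and one subtraction" could look like, here is a minimal sketch. It assumes the single number is true-positive rate minus false-positive rate, computed per category and then averaged with equal weight. The `Confusion` type, the function names, and the category labels are hypothetical, not BenchProctor's actual code.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts for one vulnerability category (hypothetical structure)."""
    tp: int  # vulnerable cases the tool flagged
    fp: int  # safe cases the tool flagged
    tn: int  # safe cases the tool left alone
    fn: int  # vulnerable cases the tool missed


def category_score(c: Confusion) -> float:
    """True-positive rate minus false-positive rate for one category."""
    positives = c.tp + c.fn
    negatives = c.fp + c.tn
    tpr = c.tp / positives if positives else 0.0
    fpr = c.fp / negatives if negatives else 0.0
    return tpr - fpr


def overall_score(per_category: dict[str, Confusion]) -> float:
    """Unweighted mean of per-category scores (assumed averaging scheme)."""
    scores = [category_score(c) for c in per_category.values()]
    return sum(scores) / len(scores)


# Hypothetical results for two categories.
results = {
    "sqli": Confusion(tp=80, fp=10, tn=90, fn=20),     # 0.8 - 0.1
    "xss": Confusion(tp=100, fp=100, tn=0, fn=0),      # flags everything: 1.0 - 1.0
}
print(round(overall_score(results), 2))
```

Under these assumptions, a tool that flags every case scores zero in that category, and averaging per category keeps a large category from drowning out a small one.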

· methodology, benchmarking

Why static SAST benchmarks rot — and what quarterly rotation fixes

A frozen benchmark measures memorization as much as analysis. Here's the failure mode, and how rotating the corpus on a seed keeps scores honest without breaking comparability.

· announcement, methodology

Introducing BenchProctor: a SAST benchmark you can't game

A polyglot, anti-leakage, quarterly-rotated corpus for measuring how accurately a static analysis tool actually finds vulnerabilities — and how often it cries wolf.