BenchProctor blog
Engineering notes
How we build a SAST benchmark you can't game — methodology, scoring, coverage, and the occasional war story from generating millions of labeled test cases.
· scoring, methodology
How BenchProctor scores a SAST tool
The whole scoring model is a confusion matrix and one subtraction. Here's how true-positive and false-positive rates become a single number, why we average per category, and how the benchmark checks itself.
Read post →
· methodology, benchmarking
Why static SAST benchmarks rot — and what quarterly rotation fixes
A frozen benchmark measures memorization as much as analysis. Here's the failure mode, and how rotating the corpus on a seed keeps scores honest without breaking comparability.
Read post →
· announcement, methodology
Introducing BenchProctor: a SAST benchmark you can't game
A polyglot, anti-leakage, quarterly-rotated corpus for measuring how accurately a static analysis tool actually finds vulnerabilities — and how often it cries wolf.
Read post →