How task packets hold up on public repos.

These scorecards turn the benchmark suite into a product readout. The question is not just whether Cartograph emits a packet; it is whether the packet changes what a developer or coding agent reads next.

Where Cartograph is already strong.

- Concrete bug-fix packets with explicit change surfaces
- Mixed-language repos when the task names the real adapter or converter path
- Small-repo ergonomics now that `analyze --static --json` stays compact by default

Where it still falls short.

- Trace-flow validation targeting in large monorepos
- Broad task packets that span frontend and backend simultaneously
- Framework-heavy repos where many tests match the same generic terms
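The last failure mode is easy to reproduce in miniature. A sketch of the collision (the test names and the naive substring matcher are illustrative assumptions, not Cartograph's actual targeting logic): a generic term matches most of a framework-style suite, while a surface-specific term narrows it to one test.

```python
# Hypothetical test inventory from a framework-heavy repo (illustrative names).
TEST_NAMES = [
    "test_router_basic",
    "test_router_params",
    "test_middleware_basic",
    "test_middleware_order",
    "test_gguf_convert_dots_ocr",
]

def match_tests(term: str, names: list[str]) -> list[str]:
    """Naive substring matching: the behavior that causes collisions."""
    return [n for n in names if term in n]

# A generic term matches the whole suite; a surface-specific term does not.
print(len(match_tests("test", TEST_NAMES)))   # → 5
print(match_tests("dots_ocr", TEST_NAMES))    # → ['test_gguf_convert_dots_ocr']
```

This is why the `llama.cpp` case below goes well: the task names a real surface (`dots_ocr`, GGUF), so specific terms exist to match on.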
Four benchmark cases that show the current shape of the product.
`llama.cpp` bug-fix
The packet stays on the Dots OCR adapter and GGUF conversion surface, then lands on the exact Dots OCR GGUF test as the first validation target.
`fastapi` bug-fix
Source-side focus is good. The packet prioritizes dependency and exception surfaces correctly, but test targeting still drifts to less relevant validations.
`next.js` trace-flow
The packet stays on router and route code instead of collapsing into fixtures, but the first validation targets are still more e2e-heavy than ideal.
`open-webui` task
Backend state and Redis surfaces rank highly, but broad task packets still admit generic structural files like `README.md` too easily.
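One plausible fix for that `README.md` leak is a re-ranking pass that penalizes generic structural files before the packet is cut. A minimal sketch, assuming illustrative file names, scores, and penalty value (none of this is Cartograph's actual ranking code):

```python
# Candidate files with relevance scores, as a broad task packet might rank them.
candidates = [
    ("backend/state.py", 0.91),
    ("backend/redis_client.py", 0.88),
    ("README.md", 0.74),          # generic structural file
    ("CONTRIBUTING.md", 0.70),    # generic structural file
    ("frontend/store.ts", 0.69),
]

# Files that describe the repo rather than implement it.
GENERIC_STRUCTURAL = {"README.md", "CONTRIBUTING.md", "LICENSE", "CHANGELOG.md"}

def rerank(items, penalty=0.5):
    """Down-weight generic structural files so they survive only on very strong scores."""
    adjusted = [
        (path, score * penalty if path.split("/")[-1] in GENERIC_STRUCTURAL else score)
        for path, score in items
    ]
    return sorted(adjusted, key=lambda x: x[1], reverse=True)

for path, score in rerank(candidates):
    print(f"{score:.2f}  {path}")
```

With a 0.5 penalty, `README.md` drops from third place to below every implementation file while still remaining in the list, which matches the failure described above: the problem is rank, not presence.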
Why this matters as a product signal.
Cartograph should get out of the way.
Compact `analyze` output is the right default. Direct reads are often faster, and the product should acknowledge that.
`packet` and `context` start paying off.
This is where the product shifts from “interesting” to “time saver.”
`analyze` becomes triage and handoff infrastructure.
The product value is not raw summary. It is cutting the next read list down to the useful shape.
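The "useful shape" claim can be made concrete: the packet's job is to turn a large candidate set into a short, ordered read list. A toy sketch of that cut, where the confidence floor, reading budget, and file paths are illustrative assumptions rather than Cartograph defaults:

```python
def next_reads(scored_files, budget=3, floor=0.5):
    """Keep only high-confidence candidates, capped at a small reading budget."""
    keep = [(path, score) for path, score in scored_files if score >= floor]
    keep.sort(key=lambda x: x[1], reverse=True)
    return [path for path, _ in keep[:budget]]

# Hypothetical scored candidates for a bug-fix task.
scored = [
    ("src/convert_hf_to_gguf.py", 0.95),
    ("src/models/dots_ocr.py", 0.90),
    ("tests/test_gguf.py", 0.62),
    ("docs/build.md", 0.31),
    ("README.md", 0.22),
]
print(next_reads(scored))
# → ['src/convert_hf_to_gguf.py', 'src/models/dots_ocr.py', 'tests/test_gguf.py']
```

The value is in what gets dropped: the low-confidence tail never reaches the developer or agent, so triage starts at the change surface instead of at the repo root.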