The Performance Loop — Johnny Castaway PS1

Performance work on this project became useful only when it became boring. Run the same scene. Capture the same frames. Record the same counters. Change one thing. Run it again. Leave it running long enough for rare failures to show up.

The project needed unattended repetition before it could distinguish a real stability improvement from one good boot. Rare failures surface only in long runs, which means leaving the loops running for days with nobody at the keyboard.

Why Printf Mattered

For a while, printf was both the obvious tool and part of the problem. Too much unbounded TTY output could destabilize the runtime. That forced the project into visual debugging first. Later, bounded performance logs came back as JCPERF and JCPERF2 lines with explicit levels:

Summary logs for scene setup and teardown.
Detail logs for frame behavior.
Debug logs only when the run is short enough to survive the noise.

That one change turned “I think the renderer is slow here” into a trace that could be compared across builds.
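
A minimal sketch of what "bounded, with explicit levels" can look like in C. The names (`PerfLog`, `perf_format`), the `L0`/`L1`/`L2` tag, and the field layout are assumptions made for illustration, not the project's actual JCPERF format. It also assumes a C99 `snprintf`, which a console-side libc may not provide.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical levels mirroring the summary/detail/debug split above. */
typedef enum { PERF_SUMMARY = 0, PERF_DETAIL = 1, PERF_DEBUG = 2 } PerfLevel;

typedef struct {
    PerfLevel max_level; /* lines above this level are dropped */
    int lines_left;      /* hard cap on emitted lines for this run */
    int dropped;         /* count of lines suppressed by level or cap */
} PerfLog;

/* Formats one line into buf without overflowing it.
 * Returns 1 if the line should be printed, 0 if it was suppressed.
 * The "JCPERF2 L<n> " prefix is an illustrative guess at a line shape. */
static int perf_format(PerfLog *log, PerfLevel level,
                       char *buf, size_t cap, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (cap == 0)
        return 0;
    if (level > log->max_level || log->lines_left <= 0) {
        log->dropped++;
        buf[0] = '\0';
        return 0;
    }
    n = snprintf(buf, cap, "JCPERF2 L%d ", (int)level);
    if (n > 0 && (size_t)n < cap) {
        va_start(ap, fmt);
        vsnprintf(buf + n, cap - (size_t)n, fmt, ap);
        va_end(ap);
    }
    log->lines_left--;
    return 1;
}
```

The caller prints the buffer with `puts` only when `perf_format` returns 1. Because the cap is counted per run, a noisy debug build runs out of budget and goes quiet instead of flooding the TTY, and the `dropped` count records how much was lost.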

The Loop That Actually Worked

The useful loop combines:

Regression testing for repeatable boots and captures.
The headless-perf battle card for the live timing matrix — 126 scene/tide variants with sortable, color-coded target_speed cells.
Performance docs for counter definitions and known bottlenecks.
Regtest reference cases for frozen host baselines.
Visual debugging for the cases where numbers pass and pixels fail.
Lab retrospectives for the human side of keeping a one-person build farm honest. Two to start with: "from 87 to 99.5", on the post-validation perf arc, and "the v0.8.1 MARY 4 freeze", on the soak loop catching what the matrix can't.
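
Comparing a run against a frozen baseline can be sketched as a tolerance check per counter. The struct, the function name, and the counter names in the usage note are hypothetical. The sketch also assumes that a higher value is always worse and that values are small enough for the multiplication not to overflow a `long`.

```c
#include <stddef.h>

typedef struct {
    const char *name; /* counter label, illustrative only */
    long baseline;    /* value frozen from a known-good build */
    long candidate;   /* value from the build under test */
} CounterPair;

/* Counts counters whose candidate exceeds the baseline by more than
 * tol_percent. Uses integer math: candidate * 100 > baseline * (100 + tol). */
static size_t count_regressions(const CounterPair *pairs, size_t n,
                                long tol_percent)
{
    size_t i, bad = 0;

    for (i = 0; i < n; i++) {
        long limit = pairs[i].baseline * (100 + tol_percent);
        if (pairs[i].candidate * 100 > limit)
            bad++;
    }
    return bad;
}
```

With a 5 percent tolerance, a made-up `frame_us` counter moving from 1000 to 1040 passes, while a made-up `blit_calls` counter moving from 200 to 230 counts as a regression. A nonzero tolerance keeps ordinary run-to-run jitter from being mistaken for a real change.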

The matrix and the soak are not redundant. The matrix runs each variant deterministically from a clean boot — perfect for “is this scene slow on its own?” The soak runs the actual screensaver loop randomized for hours — perfect for “is this scene broken by its neighbors?” A release that promotes performance without exercising both is a release that ships the next state-coupling bug, signed off and timed.
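
The difference between the two orderings can be sketched in a few lines. This is an illustration, not the project's scheduler: the function names are hypothetical, and seeding the soak with a recorded value so a failing sequence can be replayed is an assumption rather than something the pages above confirm.

```c
#include <stddef.h>
#include <stdint.h>

#define VARIANT_COUNT 126 /* matrix size quoted on the battle card */

/* Matrix order: each variant exactly once, in index order.
 * Each entry is meant to run from its own clean boot. */
static void matrix_order(uint8_t *out, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = (uint8_t)i;
}

/* xorshift32 generator; the state must be nonzero. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Soak order: the next variant is a seeded pseudo-random pick, so
 * neighbors vary between runs but a logged seed reproduces the sequence.
 * The modulo introduces a slight bias, which is acceptable for a sketch. */
static uint8_t soak_next(uint32_t *state)
{
    return (uint8_t)(xorshift32(state) % VARIANT_COUNT);
}
```

The matrix answers the isolated question because nothing precedes each variant. The soak answers the coupling question because any variant can precede any other, and the seed is what would make a multi-hour failure repeatable.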

This is the same lesson as the rest of the port: make the feedback loop narrow enough that the machine can answer. The PS1 is not vague. It does exactly what you told it to do. The hard part is arranging your tools so you can hear the answer.

Related pages

Hack: start here — first-day flow if you haven’t already done the build and boot loop.
Hack: learn C from Johnny — the printf-as-tool-with-blast-radius habit reads differently after the visual-debugging detour above.
Hack: visual debugging — the screenshot-and-overlay loop the matrix runs alongside.
Hack: memory wars — most of the matrix’s interesting failures trace back to the budget overruns this page’s siblings document.
Hack: port to a new platform — the loop discipline travels even when the toolchain doesn’t.
Performance battle card — live 126-variant matrix the loop above writes into.
Performance reference — column meanings, counter definitions, experiment-log discipline.
Lab: from 87 to 99.5 — the retrospective on the optimization arc this loop drove.
Lab: build farm — the 24/7 Docker-runner machinery wrapped around the loop.
Lab: dunking bird — the parallel-agent infrastructure that keeps the loop productive between human review passes.
Glossary: JCPERF / JCPERF2 · Glossary: scene-end · Glossary: soak-test
