Reference
PS1 hardware constraints
What the original PlayStation actually gives you, and what bites in practice when you try to use it for a 1992 screensaver port.
A labor of love by Hunter Davis. This page is a reference manual for the hardware the PS1 port actually runs on — and a short list of the places where that hardware drew blood during development. If you paid for this, you were cheated. Open source and free.
On this page
- The shape of the machine in 2026
- Hardware reference
- What bit us in practice
  - SPI pad polling needs tx_len=5, not 4
  - FntFlush is empirically broken in scene-runtime context
  - SPU HLE diverges from real hardware
  - Dirty-rect bookkeeping eats currDirty on full restores
  - VRAM corruption when scenes don’t restore the CLUT
  - TTY printf is the only debug surface, and it’s a sharp tool
  - vprintf was unbounded, and that forced debugMode=0
- Where this leaves the runtime
- Related pages
- View source on GitHub
The shape of the machine in 2026
The PlayStation 1 was released in 1994. In 2026 it is a museum piece, but it is
also the machine that runs jcreborn.bin on real hardware and inside
DuckStation. The numbers that matter, in the order you usually trip over them:
- CPU: MIPS R3000A at 33.8688 MHz. 32-bit, no hardware floating point.
- System RAM: 2 MB. Not 2 GB. Not 2 hundred MB. Two megabytes.
- VRAM: 1 MB, owned by the GPU. Two 640×480 framebuffers eat 600 KB of that on their own.
- SPU RAM: 512 KB, owned by the audio co-processor. ADPCM samples only.
- CD drive: 2x speed, 300 KB/s sustained, ~150 ms cold seek.
- Controller: SIO0 serial bus, polled at vblank-ish rates.
For Johnny Castaway, a 1992 16-color VGA screensaver running at roughly 4 fps of foreground change, that is an embarrassment of riches on some axes (a 33 MHz CPU never has to run anything close to a real game loop) and tight on others (2 MB has to hold the whole executable, the resource cache, the prepared foreground frames, and every per-frame buffer the renderer touches).
The author has spent more time fighting the I/O budget than the CPU budget. The CD’s 150 ms seek and the SPU’s separate-RAM model are what shape the runtime; the MIPS core itself is rarely the bottleneck. That distinction matters when reading the rest of these docs — most of the perf work is about hiding seeks, not making the CPU faster. See Performance work for the experiment ledger.
Hardware reference
CPU and memory
| Component | Spec |
|---|---|
| CPU | MIPS R3000A @ 33.8688 MHz |
| Floating point | None — soft-float via library |
| System RAM | 2 MB |
| VRAM | 1 MB |
| SPU RAM | 512 KB |
| BSS budget | Keep below ~50 KB for stability |
Compiler flags inherited from PSn00bSDK’s toolchain pin
-msoft-float -G0 -march=mips1 -mabi=32 -ffreestanding. The project layers
its own -O2 -Wall -Wpedantic -ffunction-sections -fdata-sections on top and
links with -Wl,--gc-sections so unused engine paths get stripped at the
linker. After that pass jcreborn.elf is around 924 KiB; the converted
jcreborn.exe lands at 208 KiB (104 × 2 KiB CD-ROM sectors) at
v0.8.12-ps1.
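The sector figure is plain ceiling division over the 2 KiB data payload of a CD-ROM sector. A minimal host-side sketch of that arithmetic follows; cd_sectors_for is a hypothetical helper name, not something taken from the project's build scripts.

```c
#include <assert.h>
#include <stddef.h>

/* User-data bytes per CD-ROM sector (the 2 KiB figure quoted above). */
#define CD_SECTOR_BYTES 2048u

/* Hypothetical helper: number of whole sectors needed to hold `bytes`. */
static size_t cd_sectors_for(size_t bytes)
{
    /* Guard the rounding addition against wrap-around. */
    assert(bytes <= (size_t)-1 - (CD_SECTOR_BYTES - 1u));
    return (bytes + CD_SECTOR_BYTES - 1u) / CD_SECTOR_BYTES;
}
```

For a 208 KiB executable this returns 104, and a file one byte larger would spill into a 105th sector.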
Graphics (GPU + VRAM)
The GPU is fixed-function. No shaders. Sprites, lines, triangles, and quads,
with a hardware ordering table (OT) for back-to-front sorting.
Color is either 4-bit indexed (16-color CLUT), 8-bit indexed (256-color
CLUT), or 15-bit RGB. Johnny Castaway is a 16-color VGA program, so almost
everything renders through 4-bit CLUT spans.
Native resolution for the port is 640×480 interlaced, which exactly matches the source material. The 1 MB of VRAM is laid out as:
| Region | Size | Location |
|---|---|---|
| Framebuffer 0 | ~300 KB | (0,0) – 640x480x16 |
| Framebuffer 1 | ~300 KB | (0,480) – 640x480x16 |
| Sprite cache | ~350 KB | (dynamic) |
| CLUTs | ~50 KB | (palette tables) |
| Total | 1.0 MB | |
The “sprite cache” line is what does most of the actual rendering work — once a foreground frame is composed in main RAM, it’s uploaded into VRAM as a texture and drawn from there. Sprite cache contention is an issue for large scenes; for Johnny Castaway’s 16-color tiles it’s mostly fine.
Audio (SPU)
The SPU is a separate processor with its own 512 KB of dedicated RAM. It plays 24 simultaneous voices of 4-bit ADPCM at up to 44.1 kHz, mixes them in hardware, and supports per-voice ADSR and stereo panning. The CPU does not mix audio; it uploads samples once and triggers them via voice keys.
For this project the entire sound bank — 23 effect samples, total ~95 KB ADPCM — preloads into SPU RAM at boot. There is no streaming, no music. See Audio pipeline for the SPU layout map.
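To sanity-check a sample bank against SPU RAM, it helps to know that the SPU's ADPCM format packs 28 samples into each 16-byte block. The sketch below is an illustration only: the function names are hypothetical, the reserved-bytes figure is whatever the caller chooses to set aside, and the project's real sample sizes come from its converter rather than from this formula.

```c
#include <assert.h>
#include <stdint.h>

#define SPU_ADPCM_BLOCK_BYTES   16u  /* 2 header bytes + 14 data bytes */
#define SPU_ADPCM_BLOCK_SAMPLES 28u  /* 14 data bytes, two 4-bit samples each */
#define SPU_RAM_BYTES           (512u * 1024u)

/* Encoded size of a mono sample, rounded up to whole ADPCM blocks. */
static uint32_t spu_adpcm_bytes(uint32_t samples)
{
    uint32_t blocks = (samples + SPU_ADPCM_BLOCK_SAMPLES - 1u)
                      / SPU_ADPCM_BLOCK_SAMPLES;
    return blocks * SPU_ADPCM_BLOCK_BYTES;
}

/* Does a bank fit once `reserved_bytes` of SPU RAM are set aside? */
static int spu_bank_fits(uint32_t bank_bytes, uint32_t reserved_bytes)
{
    assert(reserved_bytes <= SPU_RAM_BYTES);
    return bank_bytes <= SPU_RAM_BYTES - reserved_bytes;
}
```

At 44.1 kHz, one second of mono audio is 1575 blocks, or 25,200 bytes.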
Storage (CD-ROM)
The CD is a 2x drive — 300 KB/s sustained, 150 ms cold seek. Both numbers matter. The seek dominates for small reads; the bandwidth dominates for the foreground packs (the largest active pack is ~1.75 MB). The runtime treats the CD as an explicit prefetch surface, not a transparent filesystem; reads are scheduled inside held VBlanks where possible, so seek latency lives under already-idle frames.
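A rough cost model makes the seek-versus-bandwidth split concrete. The sketch below uses only the two figures quoted above (150 ms seek, 300 KB/s taken as 300 × 1024 bytes per second); the function name is hypothetical and the model ignores everything else a real drive does.

```c
#include <assert.h>
#include <stdint.h>

#define CD_SEEK_MS       150u
#define CD_BYTES_PER_SEC (300u * 1024u)  /* 2x drive, sustained */

/* Estimated milliseconds for one read, rounding the transfer time up. */
static uint32_t cd_read_ms(uint32_t bytes, int needs_seek)
{
    uint64_t transfer_ms =
        ((uint64_t)bytes * 1000u + CD_BYTES_PER_SEC - 1u) / CD_BYTES_PER_SEC;
    return (uint32_t)transfer_ms + (needs_seek ? CD_SEEK_MS : 0u);
}
```

Under this model a single 2 KiB sector costs 157 ms with a cold seek, almost all of it seek, while a 1.75 MiB pack costs about 6.1 seconds, almost all of it transfer.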
Input (controller)
Everything goes through the SIO0 serial bus. PSn00bSDK ships a BIOS pad
driver, but it does not auto-poll in the project’s PSn00bSDK 0.24 +
DuckStation environment. The port therefore carries its own SPI driver in
src/spi.c, adapted from spicyjpeg’s PSn00bSDK example, which polls at 250 Hz
off Timer 2.
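A 250 Hz tick has to fit Timer 2's 16-bit target register. The sketch below works through that arithmetic on the host, assuming the timer is clocked from the system clock divided by 8 (one of Timer 2's selectable sources). Whether src/spi.c uses that source is an assumption here, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PS1_SYSCLK_HZ 33868800u

/* Round-to-nearest number of timer ticks per poll interval. */
static uint32_t timer_target_for(uint32_t timer_hz, uint32_t poll_hz)
{
    assert(poll_hz > 0u);
    return (timer_hz + poll_hz / 2u) / poll_hz;
}

/* The target register is 16 bits wide. */
static int fits_16bit_target(uint32_t target)
{
    return target >= 1u && target <= 0xFFFFu;
}
```

With the divided clock the target is 16934 ticks, which fits. The undivided clock would need 135475 ticks, which does not.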
What bit us in practice
Hardware specs are easy. The hard part is the half-dozen places where the hardware (or the SDK on top of it) works in a way that is not obvious from the specs and that costs days of debugging when first encountered.
SPI pad polling needs tx_len=5, not 4
The spicyjpeg PSn00bSDK examples/io/pads/ reference polls the controller
with a 4-byte TX sequence. That works on hardware. On DuckStation it
returns 0xFFFF — the button bytes are simply not delivered unless the
full 5-byte poll sequence comes from the TX buffer. The fix is one
constant; finding the constant took an evening of staring at SIO0 and
guessing whether the bug was in our code, the SDK, or the emulator. It was
in the emulator, and the emulator is what almost everyone playtests on, so
the fix is permanent. src/spi.c carries tx_len=5.
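For orientation, the widely documented digital-pad exchange is five bytes in each direction, with the two button bytes arriving last. The sketch below is not the code in src/spi.c. The names are hypothetical, and it only shows how a parser might refuse a short reply and fall back to 0xFFFF (all buttons released, since the bits are active-low).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PAD_TX_LEN = 5 };

/* Poll request: one byte is clocked back for every byte sent. */
static const uint8_t pad_poll_request[PAD_TX_LEN] = {
    0x01,             /* address: controller */
    0x42,             /* command: read buttons */
    0x00, 0x00, 0x00  /* padding that clocks out the rest of the reply */
};

#define PAD_NO_BUTTONS 0xFFFFu

/* Reply layout assumed: [0] hi-Z, [1] pad ID, [2] 0x5A, [3..4] buttons. */
static uint16_t pad_parse_buttons(const uint8_t *rx, size_t rx_len)
{
    assert(rx != NULL);
    if (rx_len < PAD_TX_LEN || rx[2] != 0x5A)
        return PAD_NO_BUTTONS;
    return (uint16_t)(rx[3] | (rx[4] << 8));
}
```

A four-byte reply never carries the second button byte, so the parser reports no input rather than guessing.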
FntFlush is empirically broken in scene-runtime context
The PSn00bSDK font path (FntLoad / FntFlush) is a useful debug surface in
toy programs. In the scene-playback runtime context it is broken: text
primitives accumulate in the OT but produce no visible pixels, with no
diagnostic. The pause menu uses an embedded 8×8 ASCII font and
hand-drawn POLY_F4 quads instead. Do not regress to FntFlush for
on-screen text without a fresh empirical test.
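A minimal sketch of the replacement idea: walk an 8×8 glyph bitmap and emit one flat rectangle per horizontal run of set pixels, each of which could then be submitted as a POLY_F4. The GlyphRect type, the function name, the MSB-is-leftmost bit order, and the run-merging strategy are all assumptions for illustration, not the project's actual pause-menu code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { int16_t x, y, w, h; } GlyphRect;

/* Convert one 8x8 glyph (one byte per row) into 1-pixel-high rectangles.
 * Returns the number of rectangles written, at most max_out. */
static size_t glyph_to_rects(const uint8_t rows[8], int16_t ox, int16_t oy,
                             GlyphRect *out, size_t max_out)
{
    size_t n = 0;
    assert(rows != NULL && out != NULL);
    for (int y = 0; y < 8; y++) {
        int x = 0;
        while (x < 8) {
            if (!(rows[y] & (0x80u >> x))) { x++; continue; }
            int start = x;                      /* first lit pixel of a run */
            while (x < 8 && (rows[y] & (0x80u >> x))) x++;
            if (n == max_out) return n;         /* caller's buffer is full */
            out[n].x = (int16_t)(ox + start);
            out[n].y = (int16_t)(oy + y);
            out[n].w = (int16_t)(x - start);
            out[n].h = 1;
            n++;
        }
    }
    return n;
}
```

The worst case is 32 rectangles per glyph (four runs on each of eight rows), which bounds the primitive buffer a caller needs per character.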
SPU HLE diverges from real hardware
DuckStation’s SPU emulation is HLE (high-level emulation) by default, and
SpuSetCommonMasterVolume is one of the calls it does not honor: direct writes
to the SPU master-volume registers take effect, but the documented PSn00bSDK
helper does not actually mute the bus. The pause menu’s mute toggle therefore
writes the SPU master-volume registers directly via
*(uint16_t volatile *)0x1f801d80. The author has not yet validated this on
real hardware; until that happens, treat the audio path on hardware as
“believed correct, not signed off.”
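The direct-register mute can be sketched as below. The struct and function names are hypothetical. The code takes the register block as a pointer so it can be exercised on a host; on the PS1 that pointer would be the 0x1f801d80 address quoted above (left channel, with the right channel in the next halfword).

```c
#include <assert.h>
#include <stdint.h>

#define SPU_MASTER_VOL_ADDR 0x1f801d80u

/* Master volume pair: left at +0, right at +2. */
typedef struct {
    volatile uint16_t left;
    volatile uint16_t right;
} SpuMasterVol;

/* Set both channels to a fixed, non-negative level (0 mutes). */
static void spu_master_set(SpuMasterVol *regs, uint16_t level)
{
    assert(regs != 0);
    assert(level <= 0x3FFFu);  /* bit 15 clear selects fixed-volume mode */
    regs->left = level;
    regs->right = level;
}

/* On target (assumption):
 *   spu_master_set((SpuMasterVol *)SPU_MASTER_VOL_ADDR, 0);
 */
```

Keeping the address out of the function body means the same routine can be unit-tested against an ordinary struct on the host build.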
Dirty-rect bookkeeping eats currDirty on full restores
The graphics layer tracks dirty rectangles per tile so that only changed
rows are uploaded each frame. grRestoreBgTiles clears currDirty as a
side effect — which is correct for normal frames, but wrong for resume
paths where a full redraw needs the previous frame’s dirty extents too.
After the pause menu lands, returning to scene playback uses
grForceFullRedrawNextFrame so prevDirty and currDirty agree on the
post-resume frame. This was a one-line fix that took three runs of
“why did the screen come back wrong?” to find.
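The interaction is easier to see in a reduced model. The sketch below tracks a single row span instead of per-tile rectangles, and every name except prevDirty and currDirty is hypothetical. It illustrates why a forced full redraw has to set both spans, and is not a copy of the graphics layer.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open row range [y0, y1); empty when y0 >= y1. */
typedef struct { int16_t y0, y1; } RowSpan;

typedef struct {
    RowSpan prevDirty;  /* rows touched by the previous frame */
    RowSpan currDirty;  /* rows touched so far this frame */
} DirtyState;

static int span_empty(RowSpan s) { return s.y0 >= s.y1; }

static RowSpan span_union(RowSpan a, RowSpan b)
{
    if (span_empty(a)) return b;
    if (span_empty(b)) return a;
    RowSpan u = { a.y0 < b.y0 ? a.y0 : b.y0, a.y1 > b.y1 ? a.y1 : b.y1 };
    return u;
}

static void dirty_mark(DirtyState *st, int16_t y0, int16_t y1)
{
    assert(st != 0 && y0 <= y1);
    RowSpan s = { y0, y1 };
    st->currDirty = span_union(st->currDirty, s);
}

/* Rows to restore: what the last frame drew plus what this one drew.
 * Rotates curr into prev and clears curr. */
static RowSpan dirty_end_frame(DirtyState *st)
{
    RowSpan upload = span_union(st->prevDirty, st->currDirty);
    st->prevDirty = st->currDirty;
    st->currDirty = (RowSpan){ 0, 0 };
    return upload;
}

/* Resume path: make both spans cover the screen so they agree. */
static void dirty_force_full(DirtyState *st, int16_t height)
{
    st->prevDirty = (RowSpan){ 0, height };
    st->currDirty = (RowSpan){ 0, height };
}
```

If a restore cleared only currDirty, the first frame after resume would upload just the new rows and leave stale pixels where the pause menu had been.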
VRAM corruption when scenes don’t restore the CLUT
The 16-color palette (JOHNCAST.PAL) is loaded once at startup and never
changes — but specific scenes can stomp on the CLUT region of VRAM during
upload if the source rectangle math is off by even a few pixels. The
visible symptom is a single corrupted color across every scene that runs
afterward, until the next clean palette upload. The fix is to gate every
LoadImage call against the known CLUT rectangle and to keep palette
uploads aligned to 16-pixel boundaries.
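The gate amounts to a rectangle-overlap test run before each upload. The sketch below is a host-runnable illustration: VramRect mirrors the x/y/w/h layout of the SDK's rectangle type but is defined locally, the function names are hypothetical, and the CLUT rectangle is passed in because its real VRAM position is not stated here.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int16_t x, y, w, h; } VramRect;

/* True when the two rectangles share at least one pixel. */
static int vram_rects_overlap(const VramRect *a, const VramRect *b)
{
    return a->x < b->x + b->w && b->x < a->x + a->w &&
           a->y < b->y + b->h && b->y < a->y + a->h;
}

/* Non-palette uploads must stay clear of the protected CLUT rectangle. */
static int upload_allowed(const VramRect *dst, const VramRect *clut)
{
    assert(dst->w > 0 && dst->h > 0);
    return !vram_rects_overlap(dst, clut);
}

/* Palette uploads should start on a 16-pixel boundary. */
static int clut_x_aligned(const VramRect *clut)
{
    return (clut->x % 16) == 0;
}
```

A caller would check upload_allowed before handing the destination rectangle to LoadImage and log or skip the upload when it fails.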
TTY printf is the only debug surface, and it’s a sharp tool
DuckStation’s TTY logging works reliably as of 2026-04-25 — the project can
print from anywhere in the runtime and capture the output to a host log
file. But text I/O changes timing. A printf in a per-frame hot path will
move VBlank cadence enough to mask or invent the very bugs being debugged.
The project uses TTY for one-shot snapshots (JCPAUSE, JCPERF,
JCPERF2, JCPAD, JCSPI) and a colored on-framebuffer telemetry
overlay for steady-state visibility. See
Performance work for how the
perf module gates print levels.
vprintf was unbounded, and that forced debugMode=0
Early in the port, the legacy debug path called vprintf from a hot loop
without bounding the format buffer. Under sustained text output the stack
would walk into BSS and the executable would crash several minutes into a
session, sometimes silently. The proximate fix was to disable
debugMode in release builds; the deeper fix was to replace vprintf
with the bounded JCPERF/JCPERF2 gated formatters in src/ps1_perf.c,
which truncate inside a static buffer. The lesson, repeated across this
project: standard C library functions assume a host-class environment,
and the PS1 is not that.
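The bounded-formatter idea can be sketched with vsnprintf writing into a fixed static buffer. This is an illustration, not the code in src/ps1_perf.c: the buffer size and function name are hypothetical, and it assumes the target libc provides vsnprintf.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define LOG_BUF_BYTES 128

/* Static storage: formatting never grows the stack with the message. */
static char log_buf[LOG_BUF_BYTES];

/* Format into log_buf, truncating if needed.
 * Returns the number of characters actually stored (excluding the NUL). */
static int log_format_bounded(const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(log_buf, sizeof log_buf, fmt, ap);
    va_end(ap);

    if (n < 0) {                       /* encoding error: store nothing */
        log_buf[0] = '\0';
        return 0;
    }
    if ((size_t)n >= sizeof log_buf)   /* output was truncated */
        n = (int)sizeof log_buf - 1;
    return n;
}
```

However long the formatted message would have been, at most 127 characters and a terminator are ever written.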
Where this leaves the runtime
The current build (v0.8.12-ps1) lives well inside every hardware
budget: ~208 KiB of executable code, ~95 KB of audio in SPU RAM, two
640×480 framebuffers, and a few hundred KB of resource cache in main
RAM. The bottleneck that remains is CD-side prefetch timing, not memory or
CPU. If a future scene exposes a CPU bound, the next move is per-scene
specialized compositors, not a clock change — the clock isn’t going up.
Related pages
- Devices — what it runs on — the specific tested instances of the hardware envelope this page describes: DuckStation as the every-commit reference, SCPH-7501 via TonyHax as the smoke-tested real PS1, plus the should-work-unverified and unsupported lists.
- Build & toolchain — the cross-compile setup that targets this hardware.
- Build infrastructure — what the dev environment looks like beyond the toolchain.
- Performance work — the experiment ledger that lives on top of these constraints.
- Performance battle card — live per-scene timing against the hardware budget described above.
- Audio pipeline — SPU layout and the HLE-vs-hardware divergence in detail.
- API mapping (SDL2 → PSn00bSDK) — the call-by-call surface between the host build and the hardware described here.
- Lab: the two-day SPI bug — the retrospective on the tx_len=5 pad-polling fix the hardware section above documents in one paragraph.
- Story-loop walks — the walk subsystem’s persistent clean buffer is a direct response to the 2 MB envelope above; the page documents why re-allocating per walk fragmented the heap.
- Method — how a one-person port decides what to ship.
- Devlog — day-by-day worklog where most of these bugs got triaged.
- Glossary — definitions for hardware-specific terms used above (SPU, VRAM, VBlank, OT, mkpsxiso, PSn00bSDK, TonyHax, SPI driver, tx_len). Grouped by area, not alphabetical.
View source on GitHub
- docs/ps1/hardware-specs.md — original design doc.
- src/spi.c — the project’s own SPI driver (250 Hz Timer 2 polling, tx_len=5 for DuckStation parity). Carries the fix the two-day SPI bug retrospective walks through.
- src/ps1_perf.c — bounded JCPERF/JCPERF2 gated formatters; the safe substitute for per-frame printf() in timing-sensitive paths.