Testing the Game in a Terminal
Category: SDL Adventure Game
A while back I gave the game a terminal front end:
a build that renders the whole thing as coloured ASCII art with
libcaca and needs no display server at all. It was a fun way to play over SSH —
but the more interesting thing it unlocked was hiding in plain sight. If the game
can run with no window, no GPU, and no sound card, then it can run in CI. That
makes it testable. So I finally built the test.
The part I’d been putting off
There’s a “Future” section at the bottom of my terminal plan that I’d never come
back to: a headless test target. The terminal build already does the hard part —
before SDL_Init it sets SDL_VIDEODRIVER=offscreen and creates a software
renderer, so every frame is drawn into an ordinary RGBA buffer instead of onto a
screen (the original change).
Nothing about the game logic knows or cares. So a test is just: script some
clicks, run the loop, and check the game did what it should.
I already had a version of this for the web build — a small Puppeteer script that loads the WebAssembly page, clicks its way through Gina’s whole adventure, and asserts that the right lines of dialogue show up in the console. What I was missing was the same idea natively, with no browser in the loop.
make test
The new target builds a vaniavolpe_test binary: the terminal build minus
libcaca, using the offscreen video driver, a software renderer, and the dummy
audio driver. Instead of reading events from a terminal, it pushes a scripted
list of mouse clicks straight onto SDL’s event queue — the game’s
process_input can’t tell them apart from a real click, so every scene, minigame
and state transition runs exactly as it does for a player.
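As a rough illustration of the "scripted list of clicks" idea, here is a minimal sketch of a click script that hands out clicks once their time has come. The `ScriptedClick` and `ClickScript` types, the `click_script_poll` helper, and the time-based scheduling are all hypothetical names and choices made for this example, not the harness's actual code. The SDL2 side is shown only as a comment so the block compiles on its own.

```c
#include <stddef.h>

/* One scripted click: when to fire it (ms since the run started) and where.
 * The fields and the timing scheme are assumptions for illustration. */
typedef struct {
    unsigned at_ms;
    int x, y;
} ScriptedClick;

typedef struct {
    const ScriptedClick *clicks;
    size_t count;
    size_t next;   /* index of the first click not yet delivered */
} ClickScript;

/* Returns the next click whose time has come, or NULL if none is due yet.
 * Call it repeatedly each frame until it returns NULL. */
const ScriptedClick *click_script_poll(ClickScript *s, unsigned now_ms)
{
    if (s->next < s->count && s->clicks[s->next].at_ms <= now_ms)
        return &s->clicks[s->next++];
    return NULL;
}

/* In an SDL2 harness, each due click could be turned into a press/release
 * pair on the event queue, roughly like this (not compiled here):
 *
 *   SDL_Event ev;
 *   SDL_zero(ev);
 *   ev.type = SDL_MOUSEBUTTONDOWN;
 *   ev.button.button = SDL_BUTTON_LEFT;
 *   ev.button.state = SDL_PRESSED;
 *   ev.button.x = click->x;
 *   ev.button.y = click->y;
 *   SDL_PushEvent(&ev);
 *   ev.type = SDL_MOUSEBUTTONUP;
 *   ev.button.state = SDL_RELEASED;
 *   SDL_PushEvent(&ev);
 */
```

Keeping the script as plain data means the playthrough can be read and edited like a list, and the game loop only ever sees ordinary SDL events.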
Then it checks that the game actually got through the adventure. Rather than compare screenshots, it
watches what the game says — “Ho preso gli occhialini!”, “Cestino pieno d’uva!”,
“Ecco il tuo salvagente!” — and asserts those lines appear in order, from the hub
all the way to the dive. If one is missing, it reports which and exits non-zero.
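The "appear in order" check can be sketched in a few lines of C. This is a minimal version under assumptions: `first_missing_line` is a hypothetical helper, and it treats the captured output as one NUL-terminated string and matches each expected line as a substring after the previous match.

```c
#include <stddef.h>
#include <string.h>

/* Checks that every expected line occurs in `transcript`, in the given order.
 * Returns -1 on success, or the index of the first expected line that could
 * not be found after the previous match. */
int first_missing_line(const char *transcript,
                       const char *const *expected, size_t count)
{
    const char *cursor = transcript;
    for (size_t i = 0; i < count; i++) {
        const char *hit = strstr(cursor, expected[i]);
        if (hit == NULL)
            return (int)i;
        cursor = hit + strlen(expected[i]);   /* later lines must come after this one */
    }
    return -1;
}
```

Returning the index of the missing line is what lets a harness report which step of the playthrough failed before exiting non-zero.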
It also reads one frame back with SDL_RenderReadPixels and checks it isn’t a
single flat colour, a cheap way to catch the classic “black screen / missing
texture” regression without pinning down exact pixels.
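A flat-colour check like this one needs very little code. The sketch below assumes the frame has already been read back into a 32-bit-per-pixel buffer, for example with `SDL_RenderReadPixels(renderer, NULL, SDL_PIXELFORMAT_RGBA32, pixels, pitch)`. The function name `frame_is_flat` is made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns true if every pixel in a 32-bit-per-pixel frame equals the first one.
 * `pitch` is the byte length of one row, which may exceed width * 4;
 * any padding bytes at the end of a row are ignored. */
bool frame_is_flat(const uint8_t *pixels, int width, int height, int pitch)
{
    if (width <= 0 || height <= 0)
        return true;   /* an empty frame has nothing drawn on it */

    for (int y = 0; y < height; y++) {
        const uint8_t *row = pixels + (size_t)y * (size_t)pitch;
        for (int x = 0; x < width; x++) {
            if (memcmp(row + (size_t)x * 4, pixels, 4) != 0)
                return false;   /* found a pixel that differs from the first */
        }
    }
    return true;
}
```

The test then only has to assert that `frame_is_flat` returns false for a frame where something should be on screen.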
Capturing what the game says
That part took two goes. In the first version the game printed its dialogue with
printf, so the harness just redirected its own stdout to a file with freopen,
ran the playthrough, and read the file back to check the lines. It worked, but it
always felt like a hack — shuffling a temp file around, and relying on the game
scattering printfs to stdout.
So I did the cleaner thing: moved all the game’s dialogue and messages onto
SDL_Log (which is where the engine’s own diagnostics already went), and gave
the harness an SDL_LogSetOutputFunction sink — a small callback SDL hands every
log line, which the harness appends to an in-memory buffer and tees to stderr.
Now the assertions read that buffer directly: no temp file, no freopen, no
stdout juggling, all in-process. It was a nice two-for-one — the game logs
consistently everywhere (terminal, logcat, the browser console), and the test
gets a clean stream to match against for free.
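For illustration, the in-memory side of such a sink might look like the sketch below. `LogBuffer` and `log_buffer_append` are hypothetical names, and the growth strategy is just one reasonable choice. The SDL2 callback is shown as a comment (using the documented `SDL_LogOutputFunction` signature) so the block compiles without SDL.

```c
#include <stdlib.h>
#include <string.h>

/* Growable text buffer holding every logged message, each followed by '\n'. */
typedef struct {
    char *text;
    size_t len;
    size_t cap;
} LogBuffer;

/* Appends `message` plus a newline. Returns 0 on success, or -1 if memory
 * runs out (in which case the buffer is left unchanged). */
int log_buffer_append(LogBuffer *buf, const char *message)
{
    size_t n = strlen(message);
    size_t needed = buf->len + n + 2;          /* message + '\n' + '\0' */
    if (needed > buf->cap) {
        size_t cap = buf->cap ? buf->cap : 256;
        while (cap < needed)
            cap *= 2;
        char *grown = realloc(buf->text, cap);
        if (grown == NULL)
            return -1;
        buf->text = grown;
        buf->cap = cap;
    }
    memcpy(buf->text + buf->len, message, n);
    buf->len += n;
    buf->text[buf->len++] = '\n';
    buf->text[buf->len] = '\0';
    return 0;
}

/* SDL2 wiring, roughly (not compiled here):
 *
 *   static void sink(void *userdata, int category,
 *                    SDL_LogPriority priority, const char *message)
 *   {
 *       (void)category; (void)priority;
 *       log_buffer_append(userdata, message);
 *       fprintf(stderr, "%s\n", message);      // tee to stderr
 *   }
 *
 *   LogBuffer captured = {0};
 *   SDL_LogSetOutputFunction(sink, &captured);
 */
```

Because the buffer is always NUL-terminated, the assertions can run plain string searches over `captured.text` once the playthrough ends.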
That’s it: make test && ./vaniavolpe_test, twelve green checks, exit 0. A
GitHub Actions job runs it on every push and pull request, so a broken
playthrough now fails CI the same way a broken build does.
Why not golden screenshots
The obvious version of a “snapshot test” is to hash the rendered frame and diff it against a saved reference. I decided against making that the core of it. Exact pixel hashes are miserable in CI — they wobble between SDL versions and CPUs, and every tiny art tweak turns the whole suite red for no real reason. Asserting on the dialogue the game produces is far more stable, and it’s testing the thing I actually care about: can the hen get through her adventure.
One honest limitation: the playthrough runs at real wall-clock speed, because the
animation and talk-duration code reads SDL_GetTicks() directly, so walks and
spoken lines take as long as they take — the whole run is about three-quarters of
a minute. A faster, perfectly deterministic version would need a clock I can
fast-forward, which means threading a time source through the engine. That’s a
bigger change than it’s worth right now; the timed smoke test is robust enough to
be useful today.
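To make "a clock I can fast-forward" concrete, here is one possible shape for it. This is a sketch of an idea the engine does not have yet, so every name in it (`Clock`, `FakeClock`, `fake_clock_advance`, `talk_finished`) is hypothetical. A real build would supply a second `Clock` whose `now_ms` simply returns `SDL_GetTicks()`.

```c
#include <stdint.h>

/* A time source the engine would ask instead of calling SDL_GetTicks() directly. */
typedef struct Clock {
    uint32_t (*now_ms)(struct Clock *self);
} Clock;

/* Test clock: time only moves when the harness says so. */
typedef struct {
    Clock base;       /* must stay first so a FakeClock* can be used as a Clock* */
    uint32_t ticks;
} FakeClock;

static uint32_t fake_now(Clock *self)
{
    return ((FakeClock *)self)->ticks;
}

void fake_clock_init(FakeClock *c)
{
    c->base.now_ms = fake_now;
    c->ticks = 0;
}

void fake_clock_advance(FakeClock *c, uint32_t ms)
{
    c->ticks += ms;
}

/* Example of engine code written against the Clock: has a spoken line that
 * started at `started_ms` been showing for at least `duration_ms`?
 * Unsigned subtraction keeps this correct across tick-counter wraparound. */
int talk_finished(Clock *clock, uint32_t started_ms, uint32_t duration_ms)
{
    return clock->now_ms(clock) - started_ms >= duration_ms;
}
```

With something like this in place, a test could advance the fake clock by a fixed step per frame and finish the playthrough as fast as the CPU allows, at the cost of touching every call site that currently reads the tick counter.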
It’s a nice payoff for a front end I mostly built as a novelty. The terminal renderer turned out to be the cheapest possible test harness — a whole game loop that runs anywhere, watched not by a person squinting at ASCII art but by a machine reading the lines it prints.