How We Test Vending Machine Firmware Without Touching a Vending Machine
In short: Vending on Track tests its vending machine firmware without hardware by booting the real STM32 control-board binary inside the Renode full-system emulator and driving the live MDB payment bus from an automated test script. Every code change runs the genuine payment handshake and a complete vend — including failure recovery — in continuous integration, catching charged-but-not-dispensed bugs before a single coin is ever inserted. Here is how the harness works and why it matters for the telemetry-connected machines we run.
Most of the vending industry runs its software like it’s 1998: flash a board, walk to a machine, feed it coins, and watch what happens. If something breaks, you flash again and walk back. It’s slow, it’s manual, and — most importantly — it only ever tests the handful of scenarios a tired engineer can be bothered to reproduce by hand.
We don’t do that. Our STM32-based budget control board is the low-cost workhorse behind Vending on Track’s entry-level machines, such as the compact wall-mount units we build for HIV self-test kit dispensing and free period-product dispensing (the Dignify wall-mounted vending machine and KioskForce’s free sanitary-product wall-mount projects, for example). We test it by booting the real, unmodified firmware inside a full-system CPU emulator and driving its payment bus from an automated test, on every change, in continuous integration. No board on a bench. No hardware in the loop. No coins.
Here’s the part that surprises people: this is the budget product line. We also build a far more advanced Linux-based control platform — and yet we’ve invested in this level of rigour for the cheap board, because our customers don’t get to choose which machine fails them on a Friday night. Reliability isn’t a premium feature. It’s the product.
The problem with how vending firmware is normally tested
A vending control board is a deceptively hard piece of software. Ours juggles several independent state machines at once:
- the MDB (Multi-Drop Bus) payment protocol talking to coin changers, note readers and cashless (card/NFC/mobile) readers;
- motor control and drop-sensor verification for actually dispensing the product;
- pricing, credit, refunds, and multi-coil product groups;
- an LCD, a keypad, and EEPROM-backed configuration.
All of that runs on a tiny STM32F103 with a ~6.5-second hardware watchdog breathing down its neck. A subtle bug in the payment handshake or the dispensing logic doesn’t show up as a crash — it shows up as a customer who got charged and didn’t get their drink.
The traditional way to catch those bugs is to be that customer, repeatedly, by hand. That approach has a ceiling, and the ceiling is low. You can’t hand-test what happens when a card reader sends a slightly malformed response, or when the configuration memory comes back corrupted, or when a coil jams on the third replacement attempt — not reliably, not on every commit, and not fast enough to matter.
What we built instead
We boot the firmware in Renode, an open-source full-system emulator, and let an external test script play the role of the payment hardware on the other end of the bus.
firmware (vending_machine_firmware.elf)           test harness (Python)
───────────────────────────────────────           ─────────────────────
USART ── 9-bit MDB ──▶ MdbUart bridge ── TCP ──▶  fake card reader
I2C   ── EEPROM + keypad models                   [data, mode] frames
ADC / DWT / clocks ── modelled enough to boot
The emulated CPU runs the exact same binary we flash onto a real board — same compiler, same flags, same vending_machine_firmware.elf. We don’t stub out the firmware or test a “model” of it. We test the firmware. The only thing that’s fake is the world around it: a Python script connects over a socket and impersonates a cashless reader, speaking real MDB frames back to the firmware’s payment state machine.
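To make the socket side concrete, here is a minimal sketch of how a harness might move [data, mode] frames over a stream socket. The two-bytes-per-frame wire layout, the helper names and the example byte values are assumptions for illustration, not the actual bridge protocol; a local socket pair stands in for the emulator connection so the snippet runs on its own.

```python
import socket

def encode_frames(frames):
    """Pack (data, mode) pairs into the assumed two-bytes-per-frame wire format."""
    out = bytearray()
    for data, mode in frames:
        out += bytes([data & 0xFF, 1 if mode else 0])
    return bytes(out)

def decode_frames(raw):
    """Unpack raw socket bytes back into (data, mode) pairs."""
    if len(raw) % 2:
        raise ValueError("truncated [data, mode] frame")
    return [(raw[i], raw[i + 1] & 1) for i in range(0, len(raw), 2)]

def recv_exact(sock, n):
    """Read exactly n bytes; a stream socket may deliver them in smaller pieces."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the socket")
        buf += chunk
    return bytes(buf)

# A local socket pair stands in for the bridge's TCP connection.
firmware_side, harness_side = socket.socketpair()

# Illustrative command from the firmware: address byte (mode=1), then checksum.
firmware_side.sendall(encode_frames([(0x10, 1), (0x10, 0)]))
received = decode_frames(recv_exact(harness_side, 4))

# Illustrative one-byte acknowledgement from the fake reader, mode bit set.
harness_side.sendall(encode_frames([(0x00, 1)]))
reply = decode_frames(recv_exact(firmware_side, 2))

firmware_side.close()
harness_side.close()
```

Keeping the mode bit as its own byte on the wire is the simplest choice for a sketch like this: the harness never has to guess where a message starts or ends.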
From that harness we drive complete, end-to-end scenarios and assert on the firmware’s own debug output. Two of our standing regression suites:
1. The full cashless payment handshake. The harness drives our cashless.c state machine through its entire setup negotiation — RESET → SETUP → MIN/MAX → EXPANSION ID → READER ENABLE — and asserts the firmware negotiates the correct configuration and reaches the ENABLED state, ready to take a payment. This is a real bidirectional 9-bit, multi-byte exchange over the bus, not a smoke test.
2. A complete vend, including failure recovery. This one is end-to-end: the harness opens a session and approves a payment, the firmware actually runs a dispense, and we hold the drop-sensor beam “clear” so the firmware never sees the product drop. That deliberately forces every dispense attempt to fail on timeout, which drives the control board down its replacement-vend chain — its logic for trying the next coil in a product group when one jams. We then assert it walks the coils in exactly the right order.
That second suite tests something almost no one tests by hand: what the machine does when dispensing goes wrong. The unhappy path is where customers actually get hurt, and it’s the path manual testing skips first.
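Both suites come down to asserting on the order of events in the firmware’s debug output. The sketch below shows one way such checks could be written. The log lines, the `DISPENSE coil=N` marker and the coil numbers are invented for illustration; the real debug format is not shown in this article.

```python
import re

def stages_in_order(log_lines, stages):
    """True if every stage appears in the log, in the given order."""
    remaining = iter(log_lines)
    # Each any() resumes where the previous one stopped, so order is enforced.
    return all(any(stage in line for line in remaining) for stage in stages)

def dispense_order(log_lines):
    """Extract the sequence of coils the firmware attempted (hypothetical marker)."""
    pattern = re.compile(r"DISPENSE coil=(\d+)")
    return [int(m.group(1)) for line in log_lines for m in pattern.finditer(line)]

# Hypothetical debug output from a handshake followed by a failing vend.
sample_log = [
    "cashless: RESET",
    "cashless: SETUP",
    "cashless: MIN/MAX",
    "cashless: EXPANSION ID",
    "cashless: READER ENABLE",
    "cashless: state=ENABLED",
    "vend: DISPENSE coil=4",
    "vend: timeout, trying replacement",
    "vend: DISPENSE coil=5",
    "vend: timeout, trying replacement",
    "vend: DISPENSE coil=6",
]

handshake = ["RESET", "SETUP", "MIN/MAX", "EXPANSION ID", "READER ENABLE", "ENABLED"]
handshake_ok = stages_in_order(sample_log, handshake)
coils_tried = dispense_order(sample_log)
```

Asserting on an exact coil list, rather than on "some dispense happened", is what makes the unhappy-path check specific enough to catch a wrong replacement order.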
The engineering that made it possible
Anyone can say “we use an emulator.” The interesting part is everything that didn’t work out of the box — because that’s where the real engineering is.
A 9-bit UART that didn’t exist
MDB is an unusual protocol: it uses the serial port’s 9th bit to mark which byte is an address and which byte ends a peripheral’s reply. Renode’s stock UART is byte-only and silently throws that 9th bit away — which would make it impossible for the firmware to find the end of a message. So we wrote our own 9-bit-aware UART peripheral in C# that preserves the mode bit end-to-end, carrying each frame to the harness as a [data, mode] pair. Without this, the entire payment bus is invisible. It’s the keystone of the whole harness.
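A small sketch shows why the 9th bit matters. It builds MDB-style blocks as [data, mode] pairs, using the additive 8-bit checksum commonly described for MDB, and then shows what a byte-only UART does to them. The function names and payload bytes are illustrative assumptions, not code from our peripheral.

```python
def checksum(data):
    """Sum of all bytes in the block, truncated to 8 bits."""
    return sum(data) & 0xFF

def master_command(address_byte, args=b""):
    """Controller-to-peripheral block: the mode bit marks the address byte."""
    body = bytes([address_byte]) + bytes(args)
    frames = [(body[0], 1)] + [(b, 0) for b in body[1:]]
    frames.append((checksum(body), 0))
    return frames

def peripheral_reply(payload):
    """Peripheral-to-controller block: the mode bit marks the final byte."""
    payload = bytes(payload)
    return [(b, 0) for b in payload] + [(checksum(payload), 1)]

def find_end(frames):
    """Index of the first byte with the mode bit set, or None if there is none."""
    for index, (_, mode) in enumerate(frames):
        if mode:
            return index
    return None

def strip_mode_bit(frames):
    """What a byte-only UART model effectively does to every frame."""
    return [(data, 0) for data, _ in frames]

reply = peripheral_reply([0x01, 0x02, 0x03])   # illustrative payload
end_with_mode = find_end(reply)                 # index of the checksum byte
end_without_mode = find_end(strip_mode_bit(reply))  # None: end is undetectable
```

Once the mode bit is stripped, a receiver has no marker left to tell the last byte of a reply from any other byte.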
Bus timing that mirrors real hardware
Real MDB runs at 9600 baud. With a start bit, nine data bits and a stop bit per character, that works out to roughly one byte every 1.15 ms. Our firmware leans on that timing in a subtle way: before each incoming byte it clears a UART error flag, and the clearing sequence includes a read of the data register. That read is harmless on real hardware only because the next byte hasn’t arrived yet. If our emulated bus delivered a whole reply instantly, the same read would consume real data and corrupt the reply. So our peripheral deliberately paces the bytes ~1 ms apart to reproduce the real bus, and we run the emulated core at its true ~8 MHz so guest time stays aligned with the real-time test harness. We didn’t just make it work; we made it work for the right reasons.
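The timing argument can be checked with a little arithmetic and a toy model. The sketch below computes the character time at 9600 baud for an 11-bit character, then simulates a driver that does a dummy data-register read before waiting for each byte. The model, the 50 µs handling time and the payload are simplifying assumptions for illustration; this is not the firmware’s driver or our peripheral’s code.

```python
from collections import deque

BAUD = 9600
BITS_PER_CHAR = 1 + 9 + 1            # start bit + 8 data bits and mode bit + stop bit
BYTE_TIME_S = BITS_PER_CHAR / BAUD   # about 1.146 ms per character

def arrival_times(n_bytes, gap_s):
    """Time at which each byte has fully arrived; the first lands after one byte time."""
    return [BYTE_TIME_S + i * gap_s for i in range(n_bytes)]

def simulate_receive(payload, arrivals, handling_s=50e-6):
    """Toy driver loop: a dummy register read precedes the wait for each byte.

    If a byte is already sitting in the register when the dummy read happens,
    that read consumes it and the byte is lost.
    """
    pending = deque(zip(arrivals, payload))
    now, received = 0.0, []
    while pending:
        arrived_at, byte = pending.popleft()
        if arrived_at <= now:
            continue                  # eaten by the flag-clearing read
        now = arrived_at + handling_s
        received.append(byte)
    return received

reply = [0x01, 0x02, 0x03, 0x06]
paced = simulate_receive(reply, arrival_times(len(reply), BYTE_TIME_S))
instant = simulate_receive(reply, arrival_times(len(reply), 0.0))
```

In this model the paced delivery arrives intact, while the instant delivery loses every byte after the first, which is the failure mode the pacing is there to avoid.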
Every boot blocker, modelled honestly
Booting real firmware off-target surfaces a chain of hardware dependencies, each of which hung or rebooted the firmware until we modelled it properly: the clock tree’s startup spin-loops, the I2C EEPROM the firmware reboots without, the keypad it waits forever to acknowledge, the DWT cycle counter its microsecond delays busy-wait on, the ADC’s end-of-conversion flag, and the I2C controller itself (the stock model couldn’t drive our blocking driver, so we wrote an accurate one). None of these are shortcuts that bypass the firmware — they’re faithful enough models of real silicon to let the real boot sequence run.
Tests that are proven able to fail
This is the discipline that separates a test suite from theatre. Every assertion reflects genuinely correct behaviour, and we verify each one is able to fail for the right reason. Our replacement-vend test ships with a built-in negative control: the same vend, with our auto-balancing feature toggled off, must produce a different but equally specific dispense order. If the feature ever silently stopped working, the test couldn’t keep passing by accident. A test that can’t fail isn’t protecting you from anything.
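The negative-control idea can be sketched in a few lines. Here `run_vend` is a hypothetical stand-in for "boot the firmware, force every dispense to time out, and return the coil order"; the coil orders themselves are made up for illustration.

```python
def check_replacement_chain(run_vend, expected_on, expected_off):
    """Assert the coil order with the feature on, and a different order with it off."""
    # The control is only meaningful if the two expectations differ.
    assert expected_on != expected_off, "control must expect a different order"
    assert run_vend(auto_balance=True) == expected_on
    assert run_vend(auto_balance=False) == expected_off

def fake_run_vend(auto_balance):
    """Hypothetical healthy firmware: the feature changes the dispense order."""
    return [2, 0, 1] if auto_balance else [0, 1, 2]

def broken_run_vend(auto_balance):
    """Hypothetical regression: the feature flag is silently ignored."""
    return [0, 1, 2]

# Healthy behaviour passes both the positive check and the control.
check_replacement_chain(fake_run_vend, [2, 0, 1], [0, 1, 2])

# The silent regression is caught instead of passing by accident.
try:
    check_replacement_chain(broken_run_vend, [2, 0, 1], [0, 1, 2])
    regression_caught = False
except AssertionError:
    regression_caught = True
```

Requiring the two expected orders to differ is the key design choice: it rules out a test that would pass identically whether or not the feature does anything.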
Catching a real bug, the hard way
The harness has already earned its keep. We hit a bug (tracked internally as SD-4254) where a corrupted configuration memory — the all-0xFF pattern a blank or failed EEPROM returns — sent the firmware into an infinite loop during price lookup. On a real machine that’s a dead board in the field. We now have a regression test that seeds exactly that corrupt configuration before boot and asserts the firmware detects it and enters its factory-reset recovery path. Notably, an earlier, lazier version of our emulation had accidentally papered over this very bug by returning clean zeros for blank memory. Fixing the harness to model the hardware honestly is what let us reproduce — and permanently guard against — a genuine field-failure mode.
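The difference between the honest and the lazy emulation is easy to show with a toy EEPROM model. The class, the 256-byte size and the method names below are assumptions for illustration; the sketch covers only the seeding side of the regression test, not the firmware’s recovery logic.

```python
class EepromModel:
    """Toy byte-addressable EEPROM; unwritten cells read back as `fill`."""

    def __init__(self, size, fill=0xFF):
        self.cells = bytearray([fill]) * size

    def seed(self, offset, data):
        """Preload contents before boot, as a regression test would."""
        if offset < 0 or offset + len(data) > len(self.cells):
            raise ValueError("seed data does not fit in the EEPROM")
        self.cells[offset:offset + len(data)] = data

    def read(self, offset, length):
        return bytes(self.cells[offset:offset + length])

def looks_blank(eeprom, offset, length):
    """True if the region holds the all-0xFF pattern of a blank or failed part."""
    return all(b == 0xFF for b in eeprom.read(offset, length))

SIZE = 256                              # hypothetical capacity
honest = EepromModel(SIZE)              # blank memory reads as 0xFF
lazy = EepromModel(SIZE, fill=0x00)     # the earlier shortcut: clean zeros

honest_blank = looks_blank(honest, 0, 16)   # corrupt pattern is visible
lazy_blank = looks_blank(lazy, 0, 16)       # corrupt pattern never appears

# Seeding valid-looking data makes the region non-blank again.
honest.seed(0, bytes([0x00, 0x64]))
seeded_blank = looks_blank(honest, 0, 16)
```

With the zero-filled model the firmware can never encounter the all-0xFF case, so a test built on it cannot exercise the recovery path at all.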
Why this matters to operators
If you run our machines — or rely on our connectivity platform and telemetry to keep them online — here’s what this engineering investment buys you, concretely:
- Payment bugs get caught before they ship. The exact MDB handshake your card reader performs is exercised automatically on every firmware change. Charged-but-not-dispensed is the cardinal sin of vending; we test against it continuously.
- Failure handling is a tested feature, not an afterthought. Jammed coils, corrupt config, and failed dispenses are reproduced deterministically in CI — the scenarios that are nearly impossible to trigger reliably by hand.
- Faster, safer updates. Because the whole suite runs in seconds on ordinary build servers with no hardware attached, we can ship firmware improvements with confidence instead of crossing our fingers.
- It scales. A bench can test one board at a time. A CI farm can test thousands of boots in parallel, every day, forever.
The point
The vending industry treats firmware as something you flash and forget, validated by whoever happened to be standing in front of the machine. We treat it as software — software that handles people’s money — and we hold it to a software-engineering standard: real binary, real protocol, automated, adversarial, and reproducible, running before a single coin is ever inserted.
We do this for our budget board. That should tell you something about the rest.
Building something that needs to talk to vending machines, or evaluating a hardware partner who actually engineers their firmware? Request a demo or get in touch — we’d love to show you what’s under the hood.