A nightstand noise generator based on M5Stack Atom Echo and integrating with Home Assistant
# Device operating modes and firmware behavior

Companion to [mqtt-contract.md](./mqtt-contract.md). Where the contract defines the **wire protocol** between the device and HA, this doc defines the **device's own behavior**: what it does locally, independent of the protocol.

## Scope

Covers:
- Operating mode state machine (BOOT / ONLINE / OFFLINE)
- Button behavior in each mode
- LED status colors
- Persistent state (what's saved in NVS, what resets)
- Boot behavior and state restoration
- Reconnection strategy

Does not cover: the Rust implementation details (that's firmware code), the audio signal chain (see [signal-chain.md](./signal-chain.md)), or the MQTT wire format (see [mqtt-contract.md](./mqtt-contract.md)).

## Operating modes

```
                power on
                    │
                    ▼
               ┌────────┐
               │  BOOT  │  Initializing peripherals, restoring NVS state, attempting WiFi
               └────┬───┘
                    │
     ┌──────────────┴──────────────┐
     │ WiFi ok + MQTT ok           │ WiFi fails OR MQTT fails
     ▼                             ▼
┌──────────┐                 ┌───────────┐
│  ONLINE  │◄─── reconnect ──│  OFFLINE  │
│          │                 │           │
│ Button → │                 │ Button →  │
│   MQTT   │─── drop ────────►   local   │
│  events  │                 │  effect   │
└──────────┘                 └───────────┘
```

### BOOT

Entered on power-on or reset. Responsibilities:
1. Initialize I2S, GPIO, NVS, RGB LED
2. Read persistent state from NVS: `volume_index`, `volume_direction`, `was_playing`
3. Read the STA MAC; the lowercase 12-char hex is the device's identity for MQTT topics and discovery `unique_id`s (see [`mqtt-contract.md`](./mqtt-contract.md))
4. **If `was_playing == true`**: start the white noise generator immediately at the saved volume (power-blip recovery: don't wake the user with silence)
5. Attempt WiFi connect against compile-time stored credentials, with 60s retry on failure
6. If WiFi connects, attempt MQTT connect (the C MQTT client manages its own reconnect)
7. Transition to ONLINE or OFFLINE based on outcome
8. On the first successful MQTT `Connected` event, call `esp_ota_mark_app_valid_cancel_rollback` to confirm the running firmware (see "OTA + rollback" below)

BOOT should settle into a steady mode (ONLINE or OFFLINE) within ~45 seconds, worst case.

### ONLINE

WiFi up, MQTT connected, discovery configs published, subscribed to command topics.

- Button events publish to `nightstand/<mac_hex>/button`
- Commands from `nightstand/<mac_hex>/cmd/+` are received and acted on
- State changes publish to `nightstand/<mac_hex>/state` (retained)
- Availability topic shows `online`
- **Short press round-trips through HA**: button publishes `{"event_type":"short"}`, HA decides whether it's bedtime or morning and publishes back `cmd/play ON`/`OFF`; the device does NOT toggle audio locally. Long-press still cycles volume locally (and publishes the event for HA logging). Double-press is publish-only, a pure HA gesture.

If MQTT drops (broker down, network partition, etc.): transition to OFFLINE.

### OFFLINE

WiFi or MQTT not available. Device keeps working locally.

- Button events are **not published** (there's nobody listening)
- **Short press toggles white noise locally**: the network task supplies this fallback so muscle memory still works without HA
- Long press cycles volume locally (yo-yo through the preset list; see "Button behavior matrix")
- Double press is detected by firmware but has no local effect (no lights to control offline; the serial log notes that the late-night-lights routine is online-only)
- No commands are received
- Background task retries WiFi + MQTT every 60s; on success, transition to ONLINE

Offline mode is the primary travel mode: when the device wakes up in a hotel room and doesn't see the home WiFi, it just works as a standalone white noise machine with a button.

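The mode transitions described above can be sketched as a small pure function. This is a minimal illustration, not the firmware's actual code: the `Mode`, `NetEvent`, and `next_mode` names are hypothetical, and the real firmware drives these transitions from WiFi and MQTT client callbacks.

```rust
/// Operating modes from the state diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Boot,
    Online,
    Offline,
}

/// Connectivity events that drive mode changes (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NetEvent {
    /// WiFi is up and the MQTT client reported a connection.
    MqttConnected,
    /// WiFi or MQTT could not be brought up during BOOT.
    ConnectFailed,
    /// An established MQTT session was lost.
    MqttDropped,
}

/// Returns the mode after `event`. Pairs that do not appear in the
/// diagram leave the mode unchanged.
pub fn next_mode(mode: Mode, event: NetEvent) -> Mode {
    match (mode, event) {
        (Mode::Boot, NetEvent::MqttConnected) => Mode::Online,
        (Mode::Boot, NetEvent::ConnectFailed) => Mode::Offline,
        (Mode::Online, NetEvent::MqttDropped) => Mode::Offline,
        (Mode::Offline, NetEvent::MqttConnected) => Mode::Online,
        (unchanged, _) => unchanged,
    }
}
```

Keeping the transition table free of I/O makes it easy to unit-test on a host machine without any ESP32 hardware.
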
## Button behavior matrix

Full combined table (pairs with the state machine in `mqtt-contract.md`):

| Input | ONLINE | OFFLINE |
| --- | --- | --- |
| Short press | Publish `{"event_type":"short"}` only; HA decides and publishes back `cmd/play ON`/`OFF` (round-trip) | Toggle white noise locally (network task supplies the fallback) |
| Long press (≥2s) | Cycle volume preset locally (yo-yo) + publish `{"event_type":"long"}` | Cycle volume preset locally (yo-yo) |
| Double press | Publish `{"event_type":"double"}` (HA late-night-lights routine; no local effect) | Detected, no-op (no lights to control offline) |

**Long-press cycles volume identically in both modes**, so muscle memory doesn't change for volume. **Short-press is the deliberate divergence**: when HA is online we let it decide (is it bedtime? morning? toggle just this nightstand or both?), and when HA is offline the device stands alone. **Double-press is online-only**: it's a pure HA gesture with no useful local fallback.

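The matrix above maps naturally onto a single `match`. The sketch below is illustrative only: `Action`, `actions_for`, and the variant names are hypothetical, and the payload strings are the ones quoted in the table.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Online,
    Offline,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Press {
    Short,
    Long,
    Double,
}

/// Side effects a button press can trigger (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    /// Publish this JSON payload on the button topic.
    Publish(&'static str),
    /// Toggle white noise on the device itself.
    ToggleNoise,
    /// Advance the volume preset (yo-yo).
    CycleVolume,
    /// Log that the late-night-lights routine is online-only.
    LogOnlineOnly,
}

/// One row/column lookup in the button behavior matrix.
pub fn actions_for(mode: Mode, press: Press) -> Vec<Action> {
    match (mode, press) {
        (Mode::Online, Press::Short) => vec![Action::Publish(r#"{"event_type":"short"}"#)],
        (Mode::Online, Press::Long) => vec![
            Action::CycleVolume,
            Action::Publish(r#"{"event_type":"long"}"#),
        ],
        (Mode::Online, Press::Double) => vec![Action::Publish(r#"{"event_type":"double"}"#)],
        (Mode::Offline, Press::Short) => vec![Action::ToggleNoise],
        (Mode::Offline, Press::Long) => vec![Action::CycleVolume],
        (Mode::Offline, Press::Double) => vec![Action::LogOnlineOnly],
    }
}
```
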
**Volume cycle**: long-press advances through `[10%, 25%, 50%, 75%, 100%]` in the current direction; when it hits an end, the next long-press flips direction (yo-yo). Both volume index and direction are persisted in NVS so the cycle resumes from where you left off after a reboot.

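A minimal sketch of the yo-yo step, assuming the direction flips on the press that follows reaching an end (the prose leaves the exact moment open, but either choice yields the same preset sequence). `cycle_volume` and `Direction` are hypothetical names.

```rust
pub const VOLUME_PRESETS: [u8; 5] = [10, 25, 50, 75, 100];

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Direction {
    Up,
    Down,
}

/// Returns the (index, direction) pair after one long press.
pub fn cycle_volume(index: usize, direction: Direction) -> (usize, Direction) {
    let last = VOLUME_PRESETS.len() - 1;
    // Clamp defensively in case a stale out-of-range index was read back.
    let index = index.min(last);
    // Flip direction when the current one would run off either end.
    let direction = match (index, direction) {
        (0, Direction::Down) => Direction::Up,
        (i, Direction::Up) if i == last => Direction::Down,
        (_, d) => d,
    };
    // After the flip check, stepping can neither underflow nor overflow.
    let index = match direction {
        Direction::Up => index + 1,
        Direction::Down => index - 1,
    };
    (index, direction)
}
```
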
**Double-press as the late-night gesture**: when one of us has to get up to check on something at 3 AM, double-tapping the nightstand button asks HA to bring up the outdoor and downstairs lights. The gesture is detected by firmware in both modes but only published when online; offline, the firmware just logs a note that the routine isn't available.

**Latency note**: short-press has ~400 ms of detection latency (we have to wait for the double-press window to close before knowing it's a single). That is imperceptible for a sleepy user, and it is the cost we pay for unambiguous gesture detection.

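To make the latency trade-off concrete, here is a hedged sketch of gesture classification. It assumes the ~400 ms double-press window is measured from the first release to the second press and that both thresholds are inclusive; neither detail is specified above. `classify`, `held_ms`, and `gap_ms` are hypothetical names.

```rust
/// Hold time at or above which a press counts as long (from the matrix: ≥2s).
pub const LONG_PRESS_MS: u32 = 2_000;
/// Double-press window (from the latency note: ~400 ms).
pub const DOUBLE_WINDOW_MS: u32 = 400;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Gesture {
    Short,
    Long,
    Double,
}

/// Classifies one gesture.
///
/// `held_ms`: how long the first press was held.
/// `gap_ms`: time from the first release to a second press, or `None`
/// if the window expired with no second press.
pub fn classify(held_ms: u32, gap_ms: Option<u32>) -> Gesture {
    if held_ms >= LONG_PRESS_MS {
        return Gesture::Long;
    }
    match gap_ms {
        Some(gap) if gap <= DOUBLE_WINDOW_MS => Gesture::Double,
        // A short press can only be reported once the window has closed,
        // which is where the detection latency comes from.
        _ => Gesture::Short,
    }
}
```

A caller has to wait out `DOUBLE_WINDOW_MS` before passing `None`, so every short press is delayed by the full window while long presses are not.
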
## LED status colors

The SK6812 behind the button cap is the only status indicator. Goal: visible enough to read at 1m, dim enough to not light the room at night.

All colors are at **dim brightness** (~5-10% of full) unless noted.

The LED state machine composes a base color from two orthogonal axes (audio playback state and network state) and applies overrides for OTA and unrecoverable errors on top. `Updating` and `Error` win over the base; `PressFlash` is a transient brightening overlay that decays over ~150 ms.

| Audio × Net | Color | Pattern |
| --- | --- | --- |
| Connecting (any audio) | Cyan | Slow pulse (~1 Hz) |
| Online, idle | Green | Solid, very dim |
| Online, playing | Green | Solid, medium-dim |
| Offline, idle | Amber | Solid, very dim |
| Offline, playing | Amber | Solid, medium-dim |

| Override | Color | Pattern |
| --- | --- | --- |
| OTA download in progress | Magenta | Slow pulse (~1.25 Hz) |
| Error (I2S init failed, etc.) | Red | Slow blink (~2 Hz) |
| Button press ack | Brighten the current color ~50 % | Decays over ~150 ms |

The button-press flash is a visual confirmation: press, see a brief brighter pulse, know it registered even in the dark.

The OTA-failure path explicitly clears the magenta override (via an internal `UpdateDone` signal from the OTA worker) so a failed install drops the LED back to the audio×net base color instead of leaving it stuck pulsing magenta forever.

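The base-plus-override composition can be sketched as follows. The type and function names are hypothetical, the transient `PressFlash` overlay is left out, and ranking `Error` above `Updating` is an assumption: the tables above do not say which override wins if both are active.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Net {
    Connecting,
    Online,
    Offline,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Override {
    None,
    Updating,
    Error,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Color {
    Cyan,
    Green,
    Amber,
    Magenta,
    Red,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Pattern {
    SolidVeryDim,
    SolidMediumDim,
    SlowPulse,
    SlowBlink,
}

/// Resolves the LED output from network state, audio state, and override.
pub fn led_for(net: Net, playing: bool, over: Override) -> (Color, Pattern) {
    // Overrides win over the base color. Error-before-Updating is assumed.
    match over {
        Override::Error => return (Color::Red, Pattern::SlowBlink),
        Override::Updating => return (Color::Magenta, Pattern::SlowPulse),
        Override::None => {}
    }
    // Audio state only selects the brightness of the solid base colors.
    let level = if playing {
        Pattern::SolidMediumDim
    } else {
        Pattern::SolidVeryDim
    };
    match net {
        Net::Connecting => (Color::Cyan, Pattern::SlowPulse),
        Net::Online => (Color::Green, level),
        Net::Offline => (Color::Amber, level),
    }
}
```

Clearing the override back to `Override::None` (the role of the `UpdateDone` signal) is enough to return to the base color, because the base is recomputed from the two axes on every call.
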
## Persistent state (NVS)

Stored in ESP32's NVS flash partition. Survives power loss, restarts, even OTA updates (separate partition from app binaries).

| Key | Type | Purpose | Written when |
| --- | --- | --- | --- |
| `volume_index` | u8 | Index 0..=4 into `VOLUME_PRESETS = [10, 25, 50, 75, 100]` | Long-press cycles the index, or HA sets volume |
| `volume_direction` | u8 | 0 = Up, 1 = Down (the current yo-yo direction) | Flipped when index hits an end of the preset list |
| `was_playing` | u8 (0/1) | Whether white noise was playing at last state change | Every play/stop transition |

Not stored (computed / volatile):
- Logical name (derived from MAC)
- Connection state
- RSSI, uptime

**WiFi credentials**: not in NVS; they live in `firmware/cfg.toml` (gitignored) and are baked into the binary at compile time via `toml-cfg`. This is a v1 simplification; an NVS-backed multi-SSID list is a future v2+ enhancement (home + travel router + backup, tried in order during BOOT).

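As an illustration of the three NVS keys in the table above, the sketch below decodes raw `u8` reads into a typed struct. `Persisted` and `from_raw` are hypothetical names, the defaults are the ones listed under "Error handling", and falling back per key for missing or out-of-range values is an assumption rather than documented firmware behavior. No real NVS API is called here.

```rust
pub const VOLUME_PRESETS: [u8; 5] = [10, 25, 50, 75, 100];

/// In-memory view of the persisted keys.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Persisted {
    pub volume_index: u8,
    /// 0 = Up, 1 = Down.
    pub volume_direction: u8,
    pub was_playing: bool,
}

impl Default for Persisted {
    /// Defaults used when NVS cannot be read: 50 %, direction Up, not playing.
    fn default() -> Self {
        Persisted {
            volume_index: 2,
            volume_direction: 0,
            was_playing: false,
        }
    }
}

impl Persisted {
    /// Builds state from three raw reads. `None` models a missing key or a
    /// failed read; invalid values fall back to the default (assumption).
    pub fn from_raw(index: Option<u8>, direction: Option<u8>, playing: Option<u8>) -> Self {
        let d = Persisted::default();
        Persisted {
            volume_index: index
                .filter(|&i| (i as usize) < VOLUME_PRESETS.len())
                .unwrap_or(d.volume_index),
            volume_direction: direction.filter(|&v| v <= 1).unwrap_or(d.volume_direction),
            was_playing: playing.map(|v| v != 0).unwrap_or(d.was_playing),
        }
    }

    /// Volume in percent for the stored preset index.
    pub fn volume_percent(&self) -> u8 {
        VOLUME_PRESETS[self.volume_index as usize]
    }
}
```

Validating the index at load time means `volume_percent` can index the preset array without a bounds panic, even if the flash contents are stale or corrupted.
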
## Boot-time state restoration

Design decision: **if the device was playing white noise before losing power, it resumes automatically on boot.**

Rationale:
- Power blips happen. If the device came back silent at 3 AM, the user wakes up.
- If the user deliberately turned it off (via button or HA), `was_playing` is already `false` in NVS, so it stays off.
- The check is `was_playing` (what was the last state I was asked to be in?), not "am I currently playing" (obviously no, I just booted).

Edge case: if WiFi/MQTT never connect and `was_playing` was true, the device plays offline from the start. Correct behavior.

## Reconnection strategy

When OFFLINE (either never connected or dropped from ONLINE):

- Retry WiFi every **60 seconds**
- If WiFi connects, retry MQTT within 10s
- On MQTT success, transition to ONLINE:
  - Re-publish retained discovery configs (idempotent, covers HA restart during our offline window)
  - Re-publish retained `online` to availability topic
  - Re-publish retained state snapshot to `nightstand/<mac_hex>/state`
  - Re-subscribe to command topics
- On MQTT fail, stay OFFLINE; try again in 60s

No exponential backoff: the device is wall-powered, so we don't care about battery life, and 60s is a reasonable balance between "react to the network coming back" and "not spam the broker during multi-hour outages."

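The retry policy above can be summarized as a decision function. This is a sketch under assumptions: `Attempt`, `Next`, `Step`, and `after_attempt` are hypothetical names, and the function only models what happens after one connection attempt, not the WiFi or MQTT calls themselves.

```rust
use std::time::Duration;

/// Fixed retry interval; there is deliberately no exponential backoff.
pub const RETRY_INTERVAL: Duration = Duration::from_secs(60);

/// Work done on every transition to ONLINE, in the order listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Step {
    RepublishDiscovery,
    PublishAvailabilityOnline,
    PublishStateSnapshot,
    SubscribeCommands,
}

pub const ONLINE_STEPS: [Step; 4] = [
    Step::RepublishDiscovery,
    Step::PublishAvailabilityOnline,
    Step::PublishStateSnapshot,
    Step::SubscribeCommands,
];

/// Result of one reconnection attempt.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Attempt {
    WifiFailed,
    MqttFailed,
    MqttConnected,
}

/// What the background task does next.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Next {
    StayOffline { retry_in: Duration },
    GoOnline { steps: [Step; 4] },
}

pub fn after_attempt(attempt: Attempt) -> Next {
    match attempt {
        // Both failure kinds wait the same fixed interval.
        Attempt::WifiFailed | Attempt::MqttFailed => Next::StayOffline {
            retry_in: RETRY_INTERVAL,
        },
        Attempt::MqttConnected => Next::GoOnline { steps: ONLINE_STEPS },
    }
}
```

Because the ONLINE steps are idempotent and retained, running the full list on every reconnect is simpler than tracking which of them HA might have missed.
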
## OTA + rollback

The firmware ships with a two-OTA partition layout (`ota_0` and `ota_1`, each 1.875 MB) plus an `otadata` partition that records which slot is active. New firmware is written to the *inactive* slot via `esp_https_ota`; on success, otadata is flipped and the device reboots into the new slot.

### Partition layout (4 MB ESP32-PICO-D4)

| Region | Offset | Size | Purpose |
| --- | --- | --- | --- |
| bootloader | `0x01000` | 28 KB | ESP-IDF stage-2 loader |
| partition table | `0x08000` | 4 KB | Binary form of the partition table |
| nvs | `0x09000` | 24 KB | Volume index and direction, was_playing |
| otadata | `0x0F000` | 8 KB | Active-slot pointer |
| phy_init | `0x11000` | 4 KB | RF calibration (regenerated if missing) |
| ota_0 | `0x20000` | 1.875 MB | App slot A |
| ota_1 | `0x200000` | 1.875 MB | App slot B |

NVS sits at the same offset as the single-slot v0.1.0/v0.2.0 layout, so the partition swap preserves persisted audio state. The 56 KB gap between phy_init (which ends at `0x12000`) and ota_0 (which starts at `0x20000`) is the cost of the 64 KB alignment requirement on app partitions.

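The offsets and sizes in the table can be checked with a few lines of arithmetic. The sketch below copies the table into a constant and verifies that regions do not overlap, that the app slots are 64 KB aligned, and that everything fits in 4 MB. `Part`, `LAYOUT`, `gap_before`, and `layout_is_valid` are hypothetical names used only for this check.

```rust
const KB: u32 = 1024;
pub const FLASH_SIZE: u32 = 4 * 1024 * KB;
pub const APP_ALIGN: u32 = 64 * KB;

pub struct Part {
    pub name: &'static str,
    pub offset: u32,
    pub size: u32,
}

/// The partition layout table, in flash order. 1.875 MB = 1920 KB.
pub const LAYOUT: [Part; 7] = [
    Part { name: "bootloader", offset: 0x01000, size: 28 * KB },
    Part { name: "partition table", offset: 0x08000, size: 4 * KB },
    Part { name: "nvs", offset: 0x09000, size: 24 * KB },
    Part { name: "otadata", offset: 0x0F000, size: 8 * KB },
    Part { name: "phy_init", offset: 0x11000, size: 4 * KB },
    Part { name: "ota_0", offset: 0x20000, size: 1920 * KB },
    Part { name: "ota_1", offset: 0x200000, size: 1920 * KB },
];

/// Unused bytes between entry `i - 1` and entry `i` (requires `i >= 1`).
pub fn gap_before(layout: &[Part], i: usize) -> u32 {
    let prev = &layout[i - 1];
    layout[i].offset - (prev.offset + prev.size)
}

/// True if regions are ordered without overlap, the last one fits in
/// flash, and every app slot starts on a 64 KB boundary.
pub fn layout_is_valid(layout: &[Part]) -> bool {
    let no_overlap = layout
        .windows(2)
        .all(|w| w[0].offset + w[0].size <= w[1].offset);
    let fits = layout
        .last()
        .map_or(true, |p| p.offset + p.size <= FLASH_SIZE);
    let apps_aligned = layout
        .iter()
        .filter(|p| p.name.starts_with("ota_"))
        .all(|p| p.offset % APP_ALIGN == 0);
    no_overlap && fits && apps_aligned
}
```

With these numbers, `gap_before(&LAYOUT, 5)` is the 56 KB alignment gap in front of `ota_0`, and `ota_1` ends at `0x3E0000`, leaving 128 KB of the 4 MB flash unused.
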
### Pending-verify and `mark_app_valid`

After an OTA reboot, the new firmware boots in **pending-verify** state. The bootloader expects the running app to call `esp_ota_mark_app_valid_cancel_rollback` once it's confident things work; if a reset happens before that call, the bootloader reverts to the previous slot on the next boot. The firmware calls this on the first MQTT `Connected` event, which proves that WiFi and the broker both work (the device's primary job). Wired-flashed firmware isn't in pending-verify state, so the call is a no-op (documented behavior).

If MQTT never connects after an OTA, the device will roll back on the next reset and come up on the previous version. HA notices the `installed_version` in `update/state` reverted; the update card flips back to "Update available."

The trade-off is that any post-OTA reset before MQTT comes up triggers a rollback, even if the new firmware was fine. In practice that means: don't power-cycle a device for 30 s after clicking Install. Watching the LED flip from magenta → cyan → green is the proxy for "OTA succeeded."

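The rollback rule can be modeled as a tiny state machine. This is a deliberately simplified model of the behavior described above, not of ESP-IDF's internal image states, and `Slot`, `BootState`, and the method names are all hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Slot {
    Ota0,
    Ota1,
}

impl Slot {
    pub fn other(self) -> Slot {
        match self {
            Slot::Ota0 => Slot::Ota1,
            Slot::Ota1 => Slot::Ota0,
        }
    }
}

/// Which slot boots next and whether it still awaits confirmation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BootState {
    pub active: Slot,
    pub pending_verify: bool,
}

impl BootState {
    /// A successful OTA download: switch to the inactive slot, unconfirmed.
    pub fn after_ota_install(self) -> Self {
        BootState {
            active: self.active.other(),
            pending_verify: true,
        }
    }

    /// First MQTT `Connected` event: confirm the running image.
    /// On an already-confirmed image this changes nothing.
    pub fn on_mqtt_connected(self) -> Self {
        BootState {
            pending_verify: false,
            ..self
        }
    }

    /// Any reset: an unconfirmed image is abandoned for the previous slot.
    pub fn on_reset(self) -> Self {
        if self.pending_verify {
            BootState {
                active: self.active.other(),
                pending_verify: false,
            }
        } else {
            self
        }
    }
}
```

The model shows why an early power-cycle is indistinguishable from a bad image: `on_reset` only sees the `pending_verify` flag, not the reason for the reset.
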
### One-time wired migration

The two-OTA layout is *not* the default ESP-IDF partition table. Devices going from v0.1.0 / v0.2.0 → v0.3.x must be wire-flashed once to write the new partition table; from v0.3.0 onward, every bump is OTA. The Makefile's `flash` target writes the new bootloader, partition table, and otadata in addition to the app, so the migration is a single `make flash`.

## Error handling

| Error | Behavior |
| --- | --- |
| I2S driver init fails | Red blink, no audio. Stay in whatever connection mode works. Log loudly. |
| NVS read fails | Use defaults (volume_index=2 (=50%), volume_direction=Up, was_playing=false). Log. |
| NVS write fails | Log, keep running. State won't persist across reboot but that's a graceful degradation. |
| WiFi password wrong | Stay OFFLINE until the credentials in `cfg.toml` are corrected and the firmware is rebuilt and reflashed. No automatic recovery. |
| MQTT broker unreachable | Stay OFFLINE, retry per strategy above. |
| OTA download fails | Keep running current firmware. Log. Republish `update/state` with `in_progress: false` so HA's progress bar disappears. LED reverts from magenta to the audio×net base color. |
| OTA boot fails / app crashes before mark_valid | ESP-IDF's two-partition rollback auto-reverts to the previous slot. Device comes back on the old version; HA sees `installed_version` revert and lights up the "Update available" card again. |

## What's not in this doc

- **WiFi provisioning mechanism**: credentials are baked into the binary at compile time via `cfg.toml`. SoftAP/BLE/Improv provisioning is a possible future addition.
- **Audio generation parameters**: the actual noise generator's filter shape, amplitude, etc. live in source and are tuned over time.
- **Secure boot / signed firmware**: we don't sign images. Threat model is LAN-only, same as MQTT being plain.

## Sources

- [MQTT Contract](./mqtt-contract.md): companion doc; wire protocol
- [Signal chain](./signal-chain.md): hardware audio path
- [Atom Echo pinmap](./atom-echo/pinmap.md): GPIO usage
- [ESP-IDF NVS documentation][esp-idf-nvs]
- [ESP-IDF OTA documentation][esp-idf-ota]
- [ESP-IDF App rollback (mark_app_valid)][esp-idf-rollback]

[esp-idf-nvs]: https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-reference/storage/nvs_flash.html
[esp-idf-ota]: https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-reference/system/ota.html
[esp-idf-rollback]: https://docs.espressif.com/projects/esp-idf/en/stable/esp32/api-reference/system/ota.html#app-rollback