Rockbox open source high quality audio player as a Music Player Daemon
mpris rockbox mpd libadwaita audio rust zig deno

Merge pull request #145 from tsirysndr/feat/airplay-multi-room

Add multi-room AirPlay support with receiver management

Authored by Tsiry Sandratraina and committed by GitHub
e4877ded 82412690

+660 -413
+22 -3
README.md
@@ -30,7 +30,7 @@

 ### Audio output
 - [x] Built-in SDL audio
-- [x] AirPlay (RAOP) — stream to Apple TV, HomePod, Airport Express, shairport-sync
+- [x] AirPlay (RAOP) — single or multi-room fan-out to Apple TV, HomePod, Airport Express, shairport-sync
 - [x] Snapcast (FIFO/pipe) — synchronised multi-room via snapserver
 - [x] Squeezelite (Slim Protocol + HTTP broadcast) — synchronised multi-room
 - [x] Chromecast
@@ -228,7 +228,9 @@
 rockboxd | ffplay -f s16le -ar 44100 -ac 2 -
 ```

-### AirPlay (RAOP)
+### AirPlay (RAOP) — single or multi-room
+
+Single receiver:

 ```toml
 music_dir = "/path/to/Music"
@@ -237,9 +239,26 @@
 airplay_port = 5000 # optional, default 5000
 ```

+Multi-room (fan-out to N receivers simultaneously):
+
+```toml
+music_dir = "/path/to/Music"
+audio_output = "airplay"
+
+[[airplay_receivers]]
+host = "192.168.1.50" # living room
+port = 5000 # optional, default 5000
+
+[[airplay_receivers]]
+host = "192.168.1.51" # bedroom
+# port defaults to 5000
+```
+
 Streams ALAC-encoded audio over RTP to any RAOP-compatible receiver — Apple
 TV, HomePod, Airport Express, or
-[shairport-sync](https://github.com/mikebrady/shairport-sync).
+[shairport-sync](https://github.com/mikebrady/shairport-sync). All receivers
+share the same `initial_rtptime`, so RTP-level playback synchronisation is
+within one frame (~8 ms) across the LAN.

 ### Squeezelite (Slim Protocol — multi-room)

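A minimal sketch of how the two configuration shapes in this README hunk might be normalised into a single receiver list. It assumes the precedence rule stated in the crate README of this same change (`airplay_receivers` wins over `airplay_host`/`airplay_port`) and the documented default port 5000. `AirplaySettings` and `resolve_receivers` are hypothetical names used only for illustration; they are not taken from the repository.

```rust
/// Default RAOP port when an entry omits `port` (per the README: "default 5000").
const DEFAULT_AIRPLAY_PORT: u16 = 5000;

/// Hypothetical in-memory form of the settings shown above (not the crate's type).
#[derive(Debug, Default)]
struct AirplaySettings {
    airplay_host: Option<String>,
    airplay_port: Option<u16>,
    /// One `(host, optional port)` pair per `[[airplay_receivers]]` table.
    airplay_receivers: Vec<(String, Option<u16>)>,
}

/// Flatten the settings into `(host, port)` pairs.
/// A non-empty `airplay_receivers` list takes precedence over the single-host keys.
fn resolve_receivers(s: &AirplaySettings) -> Vec<(String, u16)> {
    if !s.airplay_receivers.is_empty() {
        return s
            .airplay_receivers
            .iter()
            .map(|(host, port)| (host.clone(), port.unwrap_or(DEFAULT_AIRPLAY_PORT)))
            .collect();
    }
    match &s.airplay_host {
        Some(host) => vec![(host.clone(), s.airplay_port.unwrap_or(DEFAULT_AIRPLAY_PORT))],
        None => Vec::new(),
    }
}

fn main() {
    let multi = AirplaySettings {
        airplay_host: Some("192.168.1.40".into()), // ignored: the list below wins
        airplay_port: None,
        airplay_receivers: vec![
            ("192.168.1.50".into(), Some(5000)), // living room
            ("192.168.1.51".into(), None),       // bedroom, default port
        ],
    };
    for (host, port) in resolve_receivers(&multi) {
        println!("{host}:{port}");
    }
}
```

Keeping the resolution in one pure function makes the backward-compatible single-receiver path and the multi-room path share the same downstream code.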
+319 -212
crates/airplay/README.md
@@ -1,7 +1,8 @@
 # rockbox-airplay — AirPlay PCM Sink

 This document traces every hop an audio frame takes from the Rockbox C firmware
-through the `rockbox-airplay` Rust crate to an AirPlay (RAOP) receiver.
+through the `rockbox-airplay` Rust crate to one or more AirPlay (RAOP)
+receivers.

 ---

@@ -18,10 +19,11 @@
 9. [RTP audio stream (`rtp.rs`)](#rtp-audio-stream-rtprs)
 10. [RTCP synchronisation](#rtcp-synchronisation)
 11. [NTP timing responder](#ntp-timing-responder)
-12. [Track transitions](#track-transitions)
-13. [Configuration](#configuration)
-14. [AirPlay 2 probe](#airplay-2-probe)
-15. [Gotchas and known limits](#gotchas-and-known-limits)
+12. [Multi-room fan-out](#multi-room-fan-out)
+13. [Track transitions](#track-transitions)
+14. [Configuration](#configuration)
+15. [AirPlay 2 probe](#airplay-2-probe)
+16. [Gotchas and known limits](#gotchas-and-known-limits)

 ---

@@ -29,16 +31,20 @@

 The AirPlay sink lets Rockbox stream audio to any RAOP-compatible receiver —
 Apple TV, HomePod, Airport Express, or third-party software such as
-[shairport-sync](https://github.com/mikebrady/shairport-sync). It implements
-**AirPlay 1 (RAOP)** entirely in pure Rust with no external C libraries.
+[shairport-sync](https://github.com/mikebrady/shairport-sync). Multiple
+receivers can be configured simultaneously for multi-room playback.
+
+The implementation is **AirPlay 1 (RAOP)** in pure Rust with no external C
+libraries. AirPlay 2 pairing (HAP SRP6a + x25519 ECDH) is attempted as a
+non-fatal probe before falling through to the AirPlay 1 path.

 The protocol stack looks like:

 ```
-RTSP/TCP  ── session negotiation (ANNOUNCE, SETUP, RECORD, TEARDOWN)
-RTP/UDP   ── ALAC-encoded audio frames
-RTCP/UDP  ── synchronisation (NTP send-report) every ~350 ms
-UDP       ── NTP timing response service
+RTSP/TCP  ── session negotiation per receiver (ANNOUNCE, SETUP, RECORD, TEARDOWN)
+RTP/UDP   ── ALAC-encoded audio frames (same frame broadcast to all receivers)
+RTCP/UDP  ── synchronisation (NTP send-report) every ~350 ms per receiver
+UDP       ── shared NTP timing response service (one port, all receivers)
 ```

 ---

@@ -46,46 +52,55 @@
 ## Layer map

 ```
-┌────────────────────────────────────────────────────────┐
-│  Rockbox C firmware (pcm.c, audio thread)              │
-│    pcm_play_data() → sink.ops.play()                   │
-│    pcm_play_dma_complete_callback() per chunk          │
-└───────────────────┬────────────────────────────────────┘
+┌─────────────────────────────────────────────────────────────┐
+│  Rockbox C firmware (pcm.c, audio thread)                   │
+│    pcm_play_data() → sink.ops.play()                        │
+│    pcm_play_dma_complete_callback() per chunk               │
+└───────────────────┬─────────────────────────────────────────┘
                     │ raw S16LE stereo PCM chunks
-┌───────────────────▼────────────────────────────────────┐
-│  firmware/target/hosted/pcm-airplay.c                  │
-│    sink_dma_start() → pcm_airplay_connect()            │
-│    airplay_thread() → pcm_airplay_write()              │
-│    sink_dma_stop()  → pcm_airplay_stop()               │
-└───────────────────┬────────────────────────────────────┘
+┌───────────────────▼─────────────────────────────────────────┐
+│  firmware/target/hosted/pcm-airplay.c                       │
+│    sink_dma_start() → pcm_airplay_connect()                 │
+│    airplay_thread() → pcm_airplay_write()                   │
+│    sink_dma_stop()  → pcm_airplay_stop()                    │
+└───────────────────┬─────────────────────────────────────────┘
                     │ extern "C" FFI
-┌───────────────────▼────────────────────────────────────┐
-│  crates/airplay/src/lib.rs                             │
-│    AirPlaySession { sender, rtsp, buf, first_frame }   │
-│    pcm_airplay_connect() — RTSP handshake              │
-│    pcm_airplay_write()   — ALAC frame dispatch         │
-│    pcm_airplay_stop()    — TEARDOWN + session clear    │
-└───────┬───────────────────────┬────────────────────────┘
-        │ RTSP/TCP              │ ALAC frames
-┌───────▼────────────┐  ┌───────▼──────────────────────┐
-│  rtsp.rs           │  │  alac.rs                     │
-│  RtspClient        │  │  encode_frame()              │
-│  ANNOUNCE / SETUP  │  │  BitWriter                   │
-│  RECORD / TEARDOWN │  │  352 S16LE → 1411-byte frame │
-└────────────────────┘  └───────┬──────────────────────┘
-                                │ encoded frames
-                        ┌───────▼──────────────────────┐
-                        │  rtp.rs                      │
-                        │  RtpSender                   │
-                        │  send_audio() — RTP/UDP      │
-                        │  send_sync() — RTCP          │
-                        │  timing_responder() — NTP    │
-                        └──────────────────────────────┘
-                                │ UDP packets
-                        ┌───────▼──────────────────────┐
-                        │  AirPlay receiver            │
-                        │  (Apple TV, shairport-sync…) │
-                        └──────────────────────────────┘
+┌───────────────────▼─────────────────────────────────────────┐
+│  crates/airplay/src/lib.rs                                  │
+│    AirPlaySession {                                         │
+│      receivers: Vec<ReceiverHandle>,                        │
+│      rtsp_clients: Vec<RtspClient>,                         │
+│      timing: TimingSocket,     ← shared, one port           │
+│      pacing: PacingClock,      ← shared clock               │
+│      buf, first_frame,                                      │
+│    }                                                        │
+│    pcm_airplay_connect() — handshake per receiver           │
+│    pcm_airplay_write()   — encode once, fan out             │
+│    pcm_airplay_stop()    — TEARDOWN all + session clear     │
+└───┬───────────────────────┬─────────────────────────────────┘
+    │ RTSP/TCP (per rx)     │ ALAC frames
+┌───▼────────────┐  ┌───────▼─────────────────────────────────┐
+│  rtsp.rs       │  │  alac.rs                                │
+│  RtspClient    │  │  encode_frame() — called once/frame     │
+│  ANNOUNCE      │  │  BitWriter                              │
+│  SETUP         │  │  352 S16LE → 1411-byte verbatim frame   │
+│  RECORD        │  └───────┬─────────────────────────────────┘
+│  SET_PARAMETER │          │ encoded frame (shared reference)
+│  TEARDOWN      │  ┌───────▼─────────────────────────────────┐
+└────────────────┘  │  rtp.rs                                 │
+                    │  ReceiverHandle (per receiver)          │
+                    │    send_audio_packet() — RTP/UDP        │
+                    │    send_sync() — RTCP                   │
+                    │  TimingSocket (shared, one port)        │
+                    │    timing_responder() — NTP thread      │
+                    │  PacingClock (shared)                   │
+                    │    pace() — one sleep for all rooms     │
+                    └───────┬─────────────────────────────────┘
+                            │ UDP packets (fan-out)
+              ┌─────────────┼─────────────┐
+       ┌──────▼──────┐ ┌────▼──────┐ ┌───▼──────┐
+       │ Receiver 1  │ │ Receiver 2│ │ …        │
+       └─────────────┘ └───────────┘ └──────────┘
 ```

 ---

@@ -99,9 +114,9 @@
 |-------------------|---------------------------------------------------------------------|
 | `init` | `pthread_mutex_init` (recursive) |
 | `postinit` | no-op |
-| `set_freq` | records `current_sample_rate` from `hw_freq_sampr[freq]` |
+| `set_freq` | no-op (sample rate is fixed at 44100 Hz) |
 | `lock` / `unlock` | `pthread_mutex_lock/unlock` |
-| `play` | `sink_dma_start` — connects, spawns `airplay_thread` |
+| `play` | `sink_dma_start` — connects all receivers, spawns `airplay_thread` |
 | `stop` | `sink_dma_stop` — signals thread, joins, calls `pcm_airplay_stop()` |

 `airplay_pcm_sink` is registered at index `PCM_SINK_AIRPLAY = 2` in the
@@ -124,26 +139,27 @@
 5. pcm_play_dma_status_callback(STARTED) ← tells audio engine chunk consumed
 ```

-Unlike the FIFO sink, there is **no explicit real-time pacing** in C. Pacing is
-handled inside `rtp.rs` — the RTP sender sleeps to maintain the correct
-wall-clock transmission rate based on the RTP timestamp increment.
+Real-time pacing is handled inside `PacingClock` in `rtp.rs` — the shared
+clock sleeps once per frame after fanning out to all receivers.
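A minimal sketch of the shared pacing idea, assuming the fields the crate README names for `PacingClock` (`stream_start`, `frames_sent`, `rtptime`) and the fixed 352-sample frame at 44100 Hz. The helper `deadline_offset` and the exact arithmetic are assumptions for illustration: here the deadline is derived from the total sample count rather than from a pre-rounded per-frame duration, which is a variation on the README's formula, not a copy of the crate's code.

```rust
use std::thread;
use std::time::{Duration, Instant};

const SAMPLES_PER_FRAME: u32 = 352;
const SAMPLE_RATE_HZ: u64 = 44_100;

/// One clock shared by every receiver in the session.
struct PacingClock {
    stream_start: Instant,
    frames_sent: u64,
    rtptime: u32,
}

impl PacingClock {
    fn new(initial_rtptime: u32) -> Self {
        Self { stream_start: Instant::now(), frames_sent: 0, rtptime: initial_rtptime }
    }

    /// Called once per frame, after the frame went out to all receivers.
    fn advance(&mut self) {
        self.frames_sent += 1;
        // RTP timestamps are 32-bit and wrap around.
        self.rtptime = self.rtptime.wrapping_add(SAMPLES_PER_FRAME);
    }

    /// Wall-clock offset from `stream_start` at which `frames_sent` frames are due.
    /// (Hypothetical helper; split into seconds and remainder to avoid overflow.)
    fn deadline_offset(&self) -> Duration {
        let samples = self.frames_sent * u64::from(SAMPLES_PER_FRAME);
        let secs = samples / SAMPLE_RATE_HZ;
        let nanos = (samples % SAMPLE_RATE_HZ) * 1_000_000_000 / SAMPLE_RATE_HZ;
        Duration::new(secs, nanos as u32)
    }

    /// Sleep once, for all rooms, if we are ahead of real time.
    fn pace(&self) {
        let deadline = self.stream_start + self.deadline_offset();
        if let Some(wait) = deadline.checked_duration_since(Instant::now()) {
            thread::sleep(wait);
        }
    }
}

fn main() {
    let mut clock = PacingClock::new(0);
    for _ in 0..44 {
        // encode one frame and send it to every receiver here
        clock.advance();
        clock.pace();
    }
    println!("sent {} frames in {:?}", clock.frames_sent, clock.stream_start.elapsed());
}
```

Because the sleep happens after the fan-out loop, adding receivers does not add sleeps; it only adds a few UDP sends inside the same frame budget.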

 ---

 ## FFI boundary

-`crates/airplay/src/lib.rs` exports three `#[no_mangle] extern "C"` functions:
+`crates/airplay/src/lib.rs` exports these `#[no_mangle] extern "C"` functions:

-| C symbol | Rust function | Purpose |
-|------------------------|------------------------|--------------------------------------|
-| `pcm_airplay_set_host` | `pcm_airplay_set_host` | Store `HOST` + `PORT` atomics/mutex |
-| `pcm_airplay_connect` | `pcm_airplay_connect` | Open RTSP + RTP session (idempotent) |
-| `pcm_airplay_write` | `pcm_airplay_write` | Buffer PCM, encode ALAC, send RTP |
-| `pcm_airplay_stop` | `pcm_airplay_stop` | Send TEARDOWN, clear session |
+| C symbol | Purpose |
+|-----------------------------|----------------------------------------------------------|
+| `pcm_airplay_set_host` | Set a single receiver (clears any previous list) |
+| `pcm_airplay_add_receiver` | Append one receiver to the multi-room list |
+| `pcm_airplay_clear_receivers` | Clear the receiver list before re-configuring |
+| `pcm_airplay_connect` | Open RTSP + RTP sessions for all configured receivers |
+| `pcm_airplay_write` | Buffer PCM, encode ALAC once, fan out to every receiver |
+| `pcm_airplay_stop` | Send TEARDOWN to all, clear session |
+| `pcm_airplay_close` | Same as stop (called on sink switch) |

-`HOST` is a `Mutex<Option<String>>` and `PORT` is an `AtomicU16` (default
-5000). `SESSION` is a `Mutex<Option<AirPlaySession>>` — the session is
-created once and reused across `write` calls for the lifetime of a track.
+`SESSION` is a `Mutex<Option<AirPlaySession>>`. `CONFIG` is a
+`Mutex<AirPlayConfig>` holding `receivers: Vec<(String, u16)>`.

 ### Force-link shim

@@ -155,8 +171,7 @@
 use rockbox_airplay::_link_airplay as _;
 ```

-where `_link_airplay` is a public no-op function in `lib.rs`. This is enough
-to pull the entire crate into the link graph.
+where `_link_airplay` is a public no-op function in `lib.rs`.

 ---

@@ -168,35 +183,52 @@
 ```
 if SESSION is already Some → return OK immediately (idempotent)

-1. Probe AirPlay 2 (non-fatal — logs and falls through on failure)
-2. RtpSender::bind(host, ports)   ← binds three UDP sockets
-3. RtspClient::new(host, port)    ← opens TCP connection to receiver
-4. rtsp.announce(sdp)             ← sends SDP describing the ALAC stream
-5. rtsp.setup(transport)          ← negotiates UDP port numbers
-6. rtsp.record()                  ← starts the session
-7. sender.send_initial_sync()     ← sends first RTCP sync packet
-8. SESSION = Some(AirPlaySession { sender, rtsp, buf: [], first_frame: true })
+1. Read receiver list from CONFIG
+2. TimingSocket::bind()           ← one shared NTP timing port + responder thread
+3. Choose shared initial_rtptime  ← same value for ALL receivers (sync anchor)
+4. For each configured receiver:
+   a. connect_one(host, port, initial_rtptime, timing_port)
+      ├── Probe AirPlay 2 (non-fatal)
+      ├── ReceiverHandle::bind()        ← audio_sock + ctrl_sock
+      ├── RtspClient::connect()         ← TCP to receiver
+      ├── rtsp.announce(sdp)            ← SDP with ALAC params
+      ├── rtsp.setup(ctrl, timing)      ← get server UDP ports
+      ├── rx.connect(audio, ctrl)       ← connect audio_sock
+      ├── rtsp.record(seq=0, ts)        ← start stream
+      └── rtsp.set_parameter_volume(0.0)
+   b. On failure: log warning, continue (partial success OK)
+5. Abort only if ZERO receivers connected
+6. session.send_initial_sync()    ← RTCP sync to all receivers
+7. SESSION = Some(AirPlaySession { receivers, rtsp_clients, timing, pacing, … })
 ```

-`pcm_airplay_write(data, len)` appends the incoming PCM bytes to `buf`, then
-drains complete 352-sample (1408-byte) frames in a loop:
+`pcm_airplay_write(data, len)` accumulates PCM in `buf`, then for each
+complete 352-sample frame:

 ```rust
-while buf.len() >= FRAME_SIZE:
-    frame_pcm = buf.drain(..FRAME_SIZE)
-    alac_frame = alac::encode_frame(&frame_pcm)
-    sender.send_audio(&alac_frame, first_frame)
-    first_frame = false
+alac = encode_frame(&frame_bytes)            // encode ONCE
+
+for rx in &mut receivers:
+    rx.send_audio_packet(&alac, rtptime, …)  // send to EACH receiver
+
+pacing.advance()                             // increment rtptime + frames_sent
+if frames_sent % 44 == 0:
+    for rx: rx.send_sync(current_ts, next_ts, false)
+
+pacing.pace()                                // sleep ONCE for all rooms
 ```

-`pcm_airplay_stop()` sends RTSP TEARDOWN and sets `SESSION = None`.
+`pcm_airplay_stop()` sends RTSP TEARDOWN to every receiver, then sets
+`SESSION = None`.

 ---

 ## RTSP handshake (`rtsp.rs`)

-`RtspClient` speaks synchronous RTSP over a single TCP connection. The full
-exchange for one session is:
+`RtspClient` speaks synchronous RTSP over a single TCP connection **per
+receiver**. The TCP connection is kept alive in `AirPlaySession.rtsp_clients`
+for the duration of the track — dropping it would cause the receiver to detect
+EOF and tear down its audio socket.

 ### 1. ANNOUNCE

@@ -211,160 +243,219 @@
 m=audio 0 RTP/AVP 96
 a=rtpmap:96 AppleLossless
 a=fmtp:96 352 0 16 40 10 14 2 255 0 0 44100
+a=min-latency:3528
 ```

-The `fmtp` parameters encode:
-`<frames_per_packet> <version> <bit_depth> <rice_history_mult>
+The `fmtp` parameters: `<frames/pkt> <version> <bit_depth> <rice_history_mult>
 <rice_initial_history> <rice_limit> <channels> <max_run> <max_frame_bytes>
-<avg_bit_rate> <sample_rate>`
+<avg_bit_rate> <sample_rate>`.
+
+No `a=rsaaeskey` line — encryption is disabled. The receiver sets
+`stream.encrypted = 0` and passes frames straight to the ALAC decoder.

 ### 2. SETUP

-Sends a `Transport` header requesting UDP:
+Requests UDP transport, advertising our local ctrl and timing ports:

 ```
-Transport: RTP/AVP/UDP;unicast;interleaved=0-1;
-           client_port=<audio_port>-<ctrl_port>
+Transport: RTP/AVP/UDP;unicast;interleaved=0-1;mode=record;
+           control_port=<local_ctrl>;timing_port=<shared_timing>
 ```

-`interleaved=0-1` is required by many receivers even though the transport is
-UDP (not RTSP interleaved). The response carries the server's UDP port pair,
-extracted by `parse_port()`.
+All receivers are advertised the **same** `timing_port` (the shared
+`TimingSocket`). The response carries the server's audio, ctrl, and timing
+ports, extracted by `parse_port()`.

 ### 3. RECORD

-Starts the stream. Sends `RTP-Info` with sequence number and RTP timestamp.
+Starts the stream. Sends `RTP-Info` with sequence number 0 and the shared
+`initial_rtptime`.

 ### 4. SET_PARAMETER (volume)

-Sets playback volume. Sent as a float string in a `text/parameters` body:
-`volume: -20.0` (range −144 to 0; 0 is full volume).
+Sets playback volume to maximum (0.0 in RAOP's −144…0 range).
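The SETUP and SET_PARAMETER steps above amount to plain string handling, which can be sketched as follows. The `Transport` request format is taken from the README text; everything else is an assumption for illustration: the `parse_port` signature here is hypothetical (the README only names the function), the sample response with `server_port=…` is a typical RAOP reply rather than one quoted from this repository, and the `\r\n` terminator on the volume body is assumed.

```rust
/// Build the SETUP `Transport` header value described above.
/// `local_ctrl` is this receiver's control socket port; `shared_timing`
/// is the single timing port advertised to every receiver.
fn setup_transport(local_ctrl: u16, shared_timing: u16) -> String {
    format!(
        "RTP/AVP/UDP;unicast;interleaved=0-1;mode=record;\
         control_port={local_ctrl};timing_port={shared_timing}"
    )
}

/// Extract one `key=<port>` field from a `Transport` response header.
/// (Hypothetical signature; the crate's `parse_port()` may differ.)
fn parse_port(transport: &str, key: &str) -> Option<u16> {
    transport.split(';').find_map(|field| {
        let (k, v) = field.trim().split_once('=')?;
        if k == key { v.parse().ok() } else { None }
    })
}

/// Body of a `text/parameters` SET_PARAMETER request for volume.
/// RAOP volume runs from -144.0 (mute) to 0.0 (full); values are clamped.
fn volume_body(db: f32) -> String {
    format!("volume: {:.1}\r\n", db.clamp(-144.0, 0.0))
}

fn main() {
    println!("Transport: {}", setup_transport(6001, 6002));

    // Example of a reply a receiver might send (assumed shape).
    let reply = "RTP/AVP/UDP;unicast;mode=record;server_port=6000;control_port=6001;timing_port=6002";
    println!("audio port  = {:?}", parse_port(reply, "server_port"));
    println!("ctrl port   = {:?}", parse_port(reply, "control_port"));
    println!("timing port = {:?}", parse_port(reply, "timing_port"));

    print!("{}", volume_body(0.0));
}
```

Returning `Option<u16>` from the parser lets the caller treat a missing or malformed port as a per-receiver connect failure instead of aborting the whole session.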

 ### 5. TEARDOWN

-Gracefully terminates the session. Called from `pcm_airplay_stop()`.
+Gracefully terminates the session. Called per-receiver from
+`pcm_airplay_stop()`.

 ---

 ## ALAC encoding (`alac.rs`)

-`encode_frame(samples: &[i16])` encodes exactly **352 stereo S16LE samples**
+`encode_frame(pcm: &[u8])` encodes exactly **352 stereo S16LE samples**
 (1408 bytes of PCM) into an ALAC verbatim ("uncompressed escape") frame.

 ### Frame format

-The Hammerton ALAC decoder expects this exact bit layout:
+The Hammerton ALAC decoder (used by shairport-sync) expects this exact bit
+layout — note there is **no** 4-bit element-instance tag after the channel
+field:

 ```
-Bits    Width  Field
-0–2     3      channels − 1 (= 1 for stereo)
-3–6     4      discarded (0)
-7–18    12     discarded (0)
-19      1      hassize = 0
-20–23   4      uncompressed_bytes = 0
-24      1      isNotCompressed = 1 ← verbatim frame flag
-25+     32     each sample as big-endian signed 16-bit, left then right
+Bits    Width  Field
+0–2     3      channels − 1 (= 1 for stereo)
+3–6     4      output_waiting — read and discarded
+7–18    12     unknown — read and discarded
+19      1      hassize = 0
+20–21   2      uncompressed_bytes = 0
+22      1      isNotCompressed = 1 ← verbatim frame flag
+23+     32     each sample as big-endian signed 16-bit, left then right
+               (352 × L + R pairs = 22,528 bits)
 ```

-Output size = 4 bytes header + 352 × 2 channels × 2 bytes/sample
-= **1412 bytes** (rounded up to byte boundary).
+Total: 23 header bits + 352 × 32 sample bits = 11,287 bits → **1411 bytes**
+(padded to byte boundary with no END tag).
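A minimal sketch of the verbatim frame layout above, assuming the 23-bit header from the new table and the `write(value, nbits)` shape the README gives for `BitWriter`. It is written from the documented bit layout, not copied from `alac.rs`; returning `Option` for a wrong-sized input is an illustrative choice. With 352 stereo frames the sample payload is 352 × 2 × 16 = 11,264 bits, which together with the 23 header bits gives the 11,287 bits (1411 bytes) quoted in the total.

```rust
const FRAMES_PER_PACKET: usize = 352;
const PCM_BYTES: usize = FRAMES_PER_PACKET * 2 * 2; // stereo, 16-bit = 1408
const ALAC_BYTES: usize = 1411;

/// MSB-first bit writer over a fixed, zero-initialised buffer.
struct BitWriter {
    buf: [u8; ALAC_BYTES],
    bit_pos: usize,
}

impl BitWriter {
    fn new() -> Self {
        Self { buf: [0; ALAC_BYTES], bit_pos: 0 }
    }

    /// Append the low `nbits` bits of `value`, most significant bit first.
    fn write(&mut self, value: u32, nbits: usize) {
        for i in (0..nbits).rev() {
            if (value >> i) & 1 == 1 {
                self.buf[self.bit_pos / 8] |= 0x80u8 >> (self.bit_pos % 8);
            }
            self.bit_pos += 1;
        }
    }
}

/// Encode 1408 bytes of interleaved S16LE stereo PCM as one verbatim frame.
/// Returns `None` if the input is not exactly one packet's worth of PCM.
fn encode_frame(pcm: &[u8]) -> Option<[u8; ALAC_BYTES]> {
    if pcm.len() != PCM_BYTES {
        return None;
    }
    let mut w = BitWriter::new();
    w.write(1, 3);  // channels - 1 (stereo)
    w.write(0, 4);  // output_waiting, discarded by the decoder
    w.write(0, 12); // unknown, discarded by the decoder
    w.write(0, 1);  // hassize = 0
    w.write(0, 2);  // uncompressed_bytes = 0
    w.write(1, 1);  // isNotCompressed = 1 (verbatim)
    for s in pcm.chunks_exact(2) {
        // Little-endian on input, big-endian bit order on output.
        let sample = i16::from_le_bytes([s[0], s[1]]);
        w.write(u32::from(sample as u16), 16);
    }
    // Remaining bits of the last byte are already zero (implicit padding).
    Some(w.buf)
}

fn main() {
    let silence = [0u8; PCM_BYTES];
    let frame = encode_frame(&silence).expect("exactly one frame of PCM");
    println!("{} bytes, header = {:02x?}", frame.len(), &frame[..3]);
}
```

Because the header is 23 bits, every sample straddles byte boundaries; that is why a bit writer is needed even though no compression takes place.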

 ### BitWriter

-`BitWriter` accumulates bits MSB-first into a `Vec<u8>`:
+`BitWriter` accumulates bits MSB-first into a `[u8; 1411]` buffer:

 ```rust
-fn write(&mut self, value: u64, nbits: u32)
+fn write(&mut self, value: u32, nbits: usize)
 fn align(&mut self) // zero-pad to next byte boundary
 ```

-The encoder calls `write` for the 25-bit header fields and then for each
-sample (16 bits per channel, interleaved L/R), then `align()` to flush the
-final byte.
-
 ---

 ## RTP audio stream (`rtp.rs`)

-`RtpSender` opens **three UDP sockets** at construction time:
+Three types in `rtp.rs` handle the per-receiver and shared concerns:
+
+### `ReceiverHandle` — per receiver

-| Socket | Direction | Purpose |
-|---------------|-------------------------|---------------------|
-| `audio_sock` | → receiver audio port | RTP audio frames |
-| `ctrl_sock` | ↔ receiver control port | RTCP sync packets |
-| `timing_sock` | ↔ receiver timing port | NTP timing exchange |
+Owns the two UDP sockets for one AirPlay endpoint:

-### `send_audio(frame, marker)`
+| Socket | Direction | Purpose |
+|--------------|-------------------------|---------------------|
+| `audio_sock` | → receiver audio port | RTP audio frames |
+| `ctrl_sock` | ↔ receiver control port | RTCP sync packets |

-Builds a 12-byte RTP header:
+Also holds `ssrc` (random per receiver) and `seqnum` (wrapping u16).
+
+`send_audio_packet(alac_frame, rtptime, frame_index, first)` builds and sends
+one 12-byte RTP packet:

 ```
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 ├─┤─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┼─┤
-│V=2│P│X│  CC   │M│    PT=96    │       Sequence Number       │
-├───────────────────────────────┼─────────────────────────────┤
-│                 Timestamp (RTP clock units)                 │
-├─────────────────────────────────────────────────────────────┤
-│                            SSRC                             │
-└─────────────────────────────────────────────────────────────┘
+│V=2│P│X│  CC   │M│    PT=96    │        Sequence Number        │
+├───────────────────────────────┼───────────────────────────────┤
+│      Timestamp (shared rtptime — same for all receivers)       │
+├────────────────────────────────────────────────────────────────┤
+│                 SSRC (per-receiver random u32)                 │
+└────────────────────────────────────────────────────────────────┘
 ```

-- `M` (marker) = 1 on the first frame of a session, 0 thereafter.
-- Timestamp increments by **352** per frame (one ALAC frame = 352 samples).
-- SSRC is a random 32-bit value chosen at sender creation.
+- `M` (marker) = 1 on the first frame of a session only.
+- Timestamp is **shared** across all receivers — all rooms decode the same
+  logical frame position.
+
+### `TimingSocket` — shared
+
+One UDP socket bound to a random port. All receivers are told this single
+port in SETUP. `timing_responder` (a background thread) answers any PT=0xD2
+timing request from any source with a PT=0xD3 response containing the current
+NTP time.
+
+### `PacingClock` — shared
+
+Tracks `stream_start` (an `Instant`), `frames_sent`, and the current `rtptime`.
+After all receivers have been sent a frame, `pace()` sleeps until the frame's
+wall-clock deadline:
+
+```rust
+let expected = stream_start + frames_sent × FRAME_DURATION_US;
+if expected > Instant::now() { thread::sleep(expected - now); }
+```

-**Real-time pacing**: `send_audio` tracks the expected transmission instant
-using `Instant` and `frame_count × Duration_per_frame` and calls
-`thread::sleep` when the sender is running ahead.
+`FRAME_DURATION_US = 352 × 1_000_000 / 44100 ≈ 7982 µs`.

 ---

 ## RTCP synchronisation

-`send_sync(first)` sends a 20-byte RTCP NTP Send Report to the control socket
-every **44 frames** (~350 ms at 44100 Hz):
+`ReceiverHandle::send_sync(current_ts, next_ts, first)` sends a 20-byte RTCP
+packet on the ctrl socket every **44 frames** (~350 ms at 44100 Hz):

 ```
-Byte   Field
-0      V=2, P=0, RC=0
-1      PT=200 (SR) or 0xD4 (first sync)
-2–3    length = 4 (words after fixed header)
-4–7    SSRC
-8–11   NTP timestamp seconds (since 1900-01-01)
-12–15  NTP timestamp fraction (2^32 units)
-16–19  RTP timestamp (matching the next audio frame's timestamp)
+Byte   Field
+0      0x80 (normal) or 0x90 (first sync, extension bit set)
+1      0xD4 (PT=212, Apple proprietary sync)
+2–3    0x0007 (length field)
+4–7    current RTP timestamp (frame just sent)
+8–11   NTP seconds (since 1900-01-01, = UNIX_time + 0x83AA7E80)
+12–15  NTP fraction (2^32 units per second)
+16–19  next RTP timestamp (next frame to be sent)
 ```

-`NTP_EPOCH_DELTA = 0x83AA_7E80` converts UNIX time (seconds since 1970) to NTP
-time (seconds since 1900).
-
-The first sync packet (`first=true`) uses PT=`0xD4` (not standard SR) — some
-receivers require this to accept the initial synchronisation.
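The 20-byte sync packet in the new table can be sketched as a pure function. The field offsets, the `0x80`/`0x90` first byte, the `0xD4` type and the `0x83AA7E80` epoch delta come from the README text; big-endian (network) byte order for the multi-byte fields is assumed, and `build_sync_packet` and `ntp_from_unix` are hypothetical names, not the crate's API. The wall-clock time is passed in as a parameter so the layout can be checked deterministically.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Seconds between the NTP epoch (1900-01-01) and the UNIX epoch (1970-01-01).
const NTP_EPOCH_DELTA: u64 = 0x83AA_7E80;

/// Convert a duration since the UNIX epoch into NTP (seconds, 2^-32 fraction).
fn ntp_from_unix(since_unix_epoch: Duration) -> (u32, u32) {
    // Truncation to 32 bits mirrors the 32-bit NTP seconds field.
    let secs = (since_unix_epoch.as_secs() + NTP_EPOCH_DELTA) as u32;
    let frac = (u64::from(since_unix_epoch.subsec_nanos()) << 32) / 1_000_000_000;
    (secs, frac as u32)
}

/// Lay out one sync packet as described in the table above.
fn build_sync_packet(current_ts: u32, next_ts: u32, first: bool, now: Duration) -> [u8; 20] {
    let (secs, frac) = ntp_from_unix(now);
    let mut p = [0u8; 20];
    p[0] = if first { 0x90 } else { 0x80 }; // extension bit set on the first sync
    p[1] = 0xD4;                            // PT = 212
    p[2..4].copy_from_slice(&7u16.to_be_bytes());
    p[4..8].copy_from_slice(&current_ts.to_be_bytes());
    p[8..12].copy_from_slice(&secs.to_be_bytes());
    p[12..16].copy_from_slice(&frac.to_be_bytes());
    p[16..20].copy_from_slice(&next_ts.to_be_bytes());
    p
}

fn main() {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is before 1970");
    let rtptime: u32 = 0x1000;
    let packet = build_sync_packet(rtptime, rtptime.wrapping_add(352), true, now);
    println!("{:02x?}", packet);
}
```

Since both timestamps come from the shared clock, the same 20 bytes (apart from any per-socket addressing) can be sent to every receiver in one loop.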
394 + `current_ts` and `next_ts` are derived from the shared `PacingClock.rtptime`, 395 + so all receivers receive consistent timestamps. 344 396 345 397 --- 346 398 347 399 ## NTP timing responder 348 400 349 - A background thread (`timing_responder`) listens on `timing_sock` and answers 350 - NTP timing requests from the receiver: 401 + A single background thread (`timing_responder`) listens on the shared 402 + `TimingSocket` and answers NTP timing requests from **all** receivers: 351 403 352 404 ``` 353 - Request PT = 0xD2 (timing request) 405 + Request PT = 0xD2 (timing request, from any receiver) 354 406 Response PT = 0xD3 (timing response) 355 407 356 - Response body (32 bytes): 357 - [0–3] SSRC 358 - [4–7] 0 (reference seconds) 359 - [8–11] 0 (reference fraction) 360 - [12–15] received seconds (echoed from request) 361 - [16–19] received fraction (echoed from request) 362 - [20–23] send seconds (current NTP time) 363 - [24–27] send fraction (current NTP time) 408 + Response layout (32 bytes): 409 + [0] 0x80 410 + [1] 0xD3 411 + [2–3] sequence number (copied from request) 412 + [4–7] padding (zero) 413 + [8–15] reference NTP (zero) 414 + [16–23] originate NTP (copied from request bytes [16–23]) 415 + [24–31] receive/transmit NTP (current system time) 364 416 ``` 365 417 366 - Many receivers stall playback if timing responses stop arriving. The thread 367 - runs for the entire duration of the session. 418 + Using one socket for all receivers works because the responder uses 419 + `send_to(src)` to reply to the exact source address of each request. 420 + 421 + --- 422 + 423 + ## Multi-room fan-out 424 + 425 + The complete per-frame processing path in `AirPlaySession::send_frame()`: 426 + 427 + ``` 428 + 1. encode_frame(&pcm) → alac: [u8; 1411] (once, ~5 µs) 429 + 2. for rx in receivers: 430 + rx.send_audio_packet(&alac, …) → UDP send (per receiver, ~1 µs each) 431 + 3. pacing.advance() → increment rtptime, frames_sent 432 + 4. 
if frames_sent % 44 == 0: 433 + for rx in receivers: 434 + rx.send_sync(…) → RTCP UDP send (per receiver) 435 + 5. pacing.pace() → thread::sleep (once, ~7982 µs avg) 436 + ``` 437 + 438 + With N receivers, steps 2 and 4 take O(N) sequential UDP sends (~1–2 µs each). 439 + Even with 10 receivers the added latency (~20 µs) is negligible compared to 440 + the 7982 µs frame budget. 441 + 442 + ### Sync accuracy 443 + 444 + All receivers share the same `initial_rtptime` and receive each frame within 445 + the same loop iteration (a few microseconds apart). Their playout timestamps 446 + are identical. Actual synchronisation accuracy is bounded by: 447 + - Receiver buffer depth (typically 1–3 s for shairport-sync) 448 + - NTP timing exchange accuracy (usually < 5 ms on LAN) 449 + 450 + This gives **AirPlay 1-level sync** — adequate for multi-room on a LAN. 451 + Sample-accurate sync across rooms requires AirPlay 2's clock-anchoring, which 452 + is a different protocol. 453 + 454 + ### Partial failure 455 + 456 + If one receiver fails to connect during `pcm_airplay_connect()`, the error is 457 + logged at `warn` level and the session continues with the remaining receivers. 458 + The session is only aborted when **zero** receivers connect successfully. 368 459 369 460 --- 370 461 ··· 372 463 373 464 When Rockbox moves to the next track: 374 465 375 - 1. `sink_dma_stop()` is called → `pcm_airplay_stop()` → RTSP TEARDOWN → 376 - `SESSION = None`. 377 - 2. `sink_dma_start()` is called for the new track → `pcm_airplay_connect()` → 378 - new RTSP session with fresh RTP sequence/timestamp counters. 466 + 1. `sink_dma_stop()` → `pcm_airplay_stop()` → RTSP TEARDOWN on every receiver 467 + → `SESSION = None`. 468 + 2. `sink_dma_start()` → `pcm_airplay_connect()` → new RTSP sessions with 469 + fresh RTP sequence/timestamp counters and a new random `initial_rtptime`. 379 470 380 471 There is a brief gap (TEARDOWN round-trip + new ANNOUNCE/SETUP/RECORD) between 381 - tracks. 
This is inherent to RAOP and is typically inaudible (<100 ms). 472 + tracks, inherent to RAOP and typically inaudible (< 100 ms). 382 473 383 474 --- 384 475 385 476 ## Configuration 386 477 387 - In `~/.config/rockbox.org/settings.toml`: 478 + ### Single receiver (backward-compatible) 388 479 389 480 ```toml 390 481 audio_output = "airplay" 391 - airplay_host = "192.168.1.x" # IP of the AirPlay receiver 482 + airplay_host = "192.168.1.50" # IP of the AirPlay receiver 392 483 airplay_port = 5000 # optional, default 5000 393 484 ``` 394 485 395 - `crates/settings/src/lib.rs:load_settings()` reads these values and calls: 486 + ### Multi-room 396 487 397 - ```rust 398 - pcm::airplay_set_host(&host, port); 399 - pcm::switch_sink(PCM_SINK_AIRPLAY); 488 + ```toml 489 + audio_output = "airplay" 490 + 491 + [[airplay_receivers]] 492 + host = "192.168.1.50" 493 + port = 5000 # optional, default 5000 494 + 495 + [[airplay_receivers]] 496 + host = "192.168.1.51" 497 + 498 + [[airplay_receivers]] 499 + host = "192.168.1.52" 500 + port = 5001 400 501 ``` 401 502 402 - `airplay_set_host` stores the host in `HOST: Mutex<Option<String>>` and the 403 - port in `PORT: AtomicU16`. These are read by `pcm_airplay_connect()` at the 404 - start of each track. 503 + `airplay_receivers` takes precedence over `airplay_host`/`airplay_port` when 504 + both are present. `crates/settings/src/lib.rs` calls 505 + `pcm_airplay_clear_receivers()` then `pcm_airplay_add_receiver()` for each 506 + entry. 507 + 508 + ### Runtime control 509 + 510 + The Rust FFI also exposes: 511 + 512 + ```rust 513 + pcm::airplay_set_host("192.168.1.50", 5000); // replace list with one receiver 514 + pcm::airplay_add_receiver("192.168.1.51", 5000); // append to list 515 + pcm::airplay_clear_receivers(); // clear before re-configuring 516 + ``` 405 517 406 518 --- 407 519 408 520 ## AirPlay 2 probe 409 521 410 - `pcm_airplay_connect()` first attempts an AirPlay 2 handshake (PTP-based). 
If 411 - it fails (connection refused, or the receiver does not support AirPlay 2) the 412 - error is logged at `tracing::debug!` level and the function falls through to the 413 - AirPlay 1 / RAOP path. This makes the probe transparent to the user. 522 + `connect_one()` first attempts an AirPlay 2 handshake (HAP-based). If it 523 + fails the error is logged at `tracing::debug!` and the function falls through 524 + to the AirPlay 1 / RAOP path. This makes the probe transparent to the user. 414 525 415 - The AirPlay 2 path uses the cryptographic dependencies declared in 416 - `Cargo.toml`: 526 + The AirPlay 2 path uses: 417 527 418 528 ```toml 419 - x25519-dalek # key exchange 420 - ed25519-dalek # signature 421 - chacha20poly1305 # AEAD encryption 529 + x25519-dalek # ephemeral key exchange (PAIR-VERIFY) 530 + ed25519-dalek # long-term identity signature 531 + chacha20poly1305 # AEAD encryption of the identity payload 422 532 sha2, hkdf, hmac # key derivation 423 - num-bigint # SRP big-integer arithmetic 533 + num-bigint # SRP 3072-bit big-integer arithmetic (PAIR-SETUP) 424 534 ``` 425 535 426 536 None of these are needed for the AirPlay 1 code path. ··· 429 539 430 540 ## Gotchas and known limits 431 541 432 - ### 1. Only one simultaneous receiver 433 - 434 - The `SESSION` mutex holds a single `AirPlaySession`. Sending to multiple 435 - AirPlay devices simultaneously is not supported. For multi-room output use 436 - the Squeezelite sink with multiple clients, or run multiple rockboxd instances. 437 - 438 - ### 2. Receiver must be on the local network 542 + ### 1. Receiver must be reachable via UDP 439 543 440 - RAOP uses UDP with no NAT traversal. The receiver must be directly reachable 441 - at the configured IP. Multicast discovery (mDNS/Bonjour) is not implemented — 442 - you must supply the IP manually. 544 + RAOP uses UDP with no NAT traversal. Every configured receiver must be 545 + directly reachable at its IP from the machine running rockboxd. 
Multicast 546 + discovery (mDNS/Bonjour) is not implemented — supply the IP manually. 443 547 444 - ### 3. `interleaved=0-1` in Transport header 548 + ### 2. `interleaved=0-1` in Transport header 445 549 446 550 Even though the transport is plain UDP, most receivers require the 447 - `interleaved=0-1` parameter in the SETUP `Transport` header. Omitting it causes 448 - the receiver to ignore the `RECORD` command silently. 551 + `interleaved=0-1` parameter in the SETUP `Transport` header. Omitting it 552 + causes the receiver to silently ignore the `RECORD` command. 449 553 450 - ### 4. Verbatim ALAC only (no compression) 554 + ### 3. Verbatim ALAC only (no compression) 451 555 452 556 `alac.rs` only implements the verbatim escape frame (`isNotCompressed=1`). 453 557 Bitrate is fixed at `sample_rate × 4 bytes/s = 176,400 bytes/s` at 44.1 kHz. 454 - This is fine for LAN streaming but wasteful compared to the compressed ALAC 455 - path. 558 + Fine for LAN streaming but higher than compressed ALAC. 559 + 560 + ### 4. Fixed 44100 Hz sample rate 456 561 457 - ### 5. Fixed 44100 Hz sample rate 562 + The SDP and ALAC frame size constants are hard-coded for 44100 Hz. Playback 563 + of 48 kHz or 96 kHz tracks is not tested. 564 + 565 + ### 5. Multi-room sync is LAN-quality, not sample-accurate 458 566 459 - The RTSP SDP and ALAC frame size constants are hard-coded for 44100 Hz. 460 - Playback of 48 kHz or 96 kHz tracks is not tested and may produce incorrect 461 - pitch or receiver errors. 567 + See [Sync accuracy](#sync-accuracy). AirPlay 2-level clock anchoring is not 568 + implemented. 462 569 463 570 ### 6. Logging uses `tracing`, never `println!` 464 571 465 - All diagnostic output is routed through the `tracing` crate. 
To see the full 466 - AirPlay negotiation: 572 + All diagnostic output is routed through the `tracing` crate: 467 573 468 574 ```sh 469 - RUST_LOG=rockbox_airplay=debug rockboxd 575 + RUST_LOG=rockbox_airplay=debug rockboxd # full protocol trace 576 + RUST_LOG=info rockboxd # lifecycle events only 470 577 ``` 471 578 472 - Never add `println!` or `eprintln!` — those bypass the log filter and pollute 473 - stdout, breaking FIFO/pipe mode. 579 + Never add `println!` or `eprintln!` — those bypass the log filter and can 580 + corrupt the stdout PCM stream in FIFO mode.
+191 -84
crates/airplay/src/lib.rs
··· 7 7 #[doc(hidden)] 8 8 pub fn _link_airplay() {} 9 9 10 - use alac::{encode_frame, PCM_BYTES_PER_FRAME}; 11 - use rtp::RtpSender; 10 + use alac::{encode_frame, FRAME_SAMPLES, PCM_BYTES_PER_FRAME}; 11 + use rtp::{PacingClock, ReceiverHandle, TimingSocket}; 12 12 use rtsp::RtspClient; 13 13 14 14 use std::ffi::CStr; 15 15 use std::os::raw::{c_char, c_int, c_ushort}; 16 16 use std::sync::Mutex; 17 17 18 + // --------------------------------------------------------------------------- 19 + // Global state 20 + // --------------------------------------------------------------------------- 21 + 18 22 static SESSION: Mutex<Option<AirPlaySession>> = Mutex::new(None); 19 23 20 24 struct AirPlaySession { 21 - sender: RtpSender, 22 - rtsp: RtspClient, 25 + receivers: Vec<ReceiverHandle>, 26 + timing: TimingSocket, 27 + rtsp_clients: Vec<RtspClient>, 23 28 buf: Vec<u8>, 24 29 first_frame: bool, 30 + pacing: PacingClock, 25 31 } 26 32 33 + impl AirPlaySession { 34 + /// Encode one ALAC frame and fan it out to every connected receiver. 
35 + fn send_frame(&mut self, frame_bytes: &[u8; PCM_BYTES_PER_FRAME], first: bool) { 36 + let alac = encode_frame(frame_bytes); 37 + let rtptime = self.pacing.rtptime; 38 + let frame_index = self.pacing.frames_sent; 39 + 40 + for rx in &mut self.receivers { 41 + rx.send_audio_packet(&alac, rtptime, frame_index, first); 42 + } 43 + 44 + self.pacing.advance(); 45 + 46 + // RTCP NTP sync every ~44 frames (~0.35 s) 47 + if self.pacing.frames_sent % 44 == 0 { 48 + let current_ts = self.pacing.rtptime.wrapping_sub(FRAME_SAMPLES as u32); 49 + let next_ts = self.pacing.rtptime; 50 + for rx in &self.receivers { 51 + rx.send_sync(current_ts, next_ts, false); 52 + } 53 + } 54 + 55 + // Pace once for all receivers 56 + self.pacing.pace(); 57 + } 58 + 59 + fn send_initial_sync(&self) { 60 + let ts = self.pacing.initial_rtptime; 61 + for rx in &self.receivers { 62 + rx.send_sync(ts, ts, true); 63 + } 64 + tracing::debug!( 65 + "sent initial sync ts={} to {} receiver(s)", 66 + ts, 67 + self.receivers.len() 68 + ); 69 + } 70 + } 71 + 72 + // --------------------------------------------------------------------------- 73 + // Config 74 + // --------------------------------------------------------------------------- 75 + 27 76 static CONFIG: Mutex<AirPlayConfig> = Mutex::new(AirPlayConfig { 28 - host: None, 29 - port: 5000, 77 + receivers: Vec::new(), 30 78 }); 31 79 32 80 struct AirPlayConfig { 33 - host: Option<String>, 34 - port: u16, 81 + receivers: Vec<(String, u16)>, 35 82 } 36 83 37 - // Safety: the raw pointer in host is only touched inside the mutex 84 + // Safety: Vec<(String, u16)> is Send 38 85 unsafe impl Send for AirPlayConfig {} 39 86 87 + // --------------------------------------------------------------------------- 88 + // FFI — configuration 89 + // --------------------------------------------------------------------------- 90 + 91 + /// Set a single AirPlay receiver, replacing any previously configured list. 
92 + /// Kept for backward compatibility with existing C callers and settings. 40 93 #[no_mangle] 41 94 pub extern "C" fn pcm_airplay_set_host(host: *const c_char, port: c_ushort) { 42 95 if host.is_null() { ··· 46 99 .to_string_lossy() 47 100 .into_owned(); 48 101 let mut cfg = CONFIG.lock().unwrap(); 49 - cfg.host = Some(s); 50 - cfg.port = port; 102 + cfg.receivers.clear(); 103 + cfg.receivers.push((s, port)); 104 + } 105 + 106 + /// Append one receiver to the multi-room list. 107 + #[no_mangle] 108 + pub extern "C" fn pcm_airplay_add_receiver(host: *const c_char, port: c_ushort) { 109 + if host.is_null() { 110 + return; 111 + } 112 + let s = unsafe { CStr::from_ptr(host) } 113 + .to_string_lossy() 114 + .into_owned(); 115 + let mut cfg = CONFIG.lock().unwrap(); 116 + cfg.receivers.push((s, port)); 117 + } 118 + 119 + /// Clear the receiver list (call before re-configuring). 120 + #[no_mangle] 121 + pub extern "C" fn pcm_airplay_clear_receivers() { 122 + CONFIG.lock().unwrap().receivers.clear(); 51 123 } 52 124 125 + // --------------------------------------------------------------------------- 126 + // FFI — session lifecycle 127 + // --------------------------------------------------------------------------- 128 + 53 129 #[no_mangle] 54 130 pub extern "C" fn pcm_airplay_connect() -> c_int { 55 - // Already connected — don't redo the RTSP handshake for every DMA chunk. 
56 131 if SESSION.lock().unwrap().is_some() { 57 - return 0; 132 + return 0; // idempotent 58 133 } 59 134 60 - let cfg = CONFIG.lock().unwrap(); 61 - let host = match cfg.host.clone() { 62 - Some(h) => h, 63 - None => { 64 - tracing::error!("pcm_airplay_connect: no host configured"); 135 + let targets = { 136 + let cfg = CONFIG.lock().unwrap(); 137 + if cfg.receivers.is_empty() { 138 + tracing::error!("pcm_airplay_connect: no receivers configured"); 65 139 return -1; 66 140 } 141 + cfg.receivers.clone() 67 142 }; 68 - let port = cfg.port; 69 - drop(cfg); 70 143 71 - let local_ip = local_ip_for(&host).unwrap_or_else(|| "127.0.0.1".to_string()); 72 - tracing::info!("connecting to {}:{} (local_ip={})", host, port, local_ip); 73 - 74 - // Attempt AirPlay 2 pairing (PAIR-VERIFY / PAIR-SETUP). 75 - // Failure here is non-fatal — many AirPlay 1 receivers don't have the endpoint. 76 - match airplay2::connect(&host, port, None) { 77 - Ok(()) => tracing::info!("AirPlay 2 handshake complete"), 78 - Err(e) => tracing::debug!("AirPlay 2 handshake skipped ({}), using AirPlay 1", e), 79 - } 80 - 81 - let session_token: u64 = rand::random(); 82 - let ssrc: u32 = rand::random(); 144 + // All receivers share the same initial_rtptime for RTP-level synchronisation. 83 145 let initial_rtptime: u32 = rand::random(); 84 146 85 - // Bind all UDP sockets first so we know the local ports before SETUP. 86 - let mut sender = match RtpSender::bind(ssrc, initial_rtptime) { 87 - Ok(s) => s, 147 + // Bind the shared timing socket first; every receiver advertises the same port. 
148 + let timing = match TimingSocket::bind() { 149 + Ok(t) => t, 88 150 Err(e) => { 89 - tracing::error!("bind failed: {}", e); 151 + tracing::error!("timing socket bind failed: {}", e); 90 152 return -1; 91 153 } 92 154 }; 93 - let local_ctrl_port = sender.local_ctrl_port; 94 - let local_timing_port = sender.local_timing_port; 155 + let local_timing_port = timing.local_port; 156 + 157 + let mut receivers: Vec<ReceiverHandle> = Vec::new(); 158 + let mut rtsp_clients: Vec<RtspClient> = Vec::new(); 159 + let mut connected = 0usize; 95 160 96 - let mut rtsp = match RtspClient::connect(&host, port, session_token) { 97 - Ok(c) => c, 98 - Err(e) => { 99 - tracing::error!("RTSP TCP connect failed: {}", e); 100 - return -1; 161 + for (host, port) in &targets { 162 + match connect_one(host, *port, initial_rtptime, local_timing_port) { 163 + Ok((rx, rtsp)) => { 164 + tracing::info!("connected to {}:{}", host, port); 165 + receivers.push(rx); 166 + rtsp_clients.push(rtsp); 167 + connected += 1; 168 + } 169 + Err(e) => tracing::warn!("failed to connect to {}:{}: {}", host, port, e), 101 170 } 102 - }; 171 + } 103 172 104 - if let Err(e) = rtsp.announce(&local_ip, &host) { 105 - tracing::error!("ANNOUNCE failed: {}", e); 173 + if connected == 0 { 174 + tracing::error!("could not connect to any AirPlay receiver"); 106 175 return -1; 107 176 } 108 177 109 - let (server_audio, server_ctrl, _server_timing) = 110 - match rtsp.setup(local_ctrl_port, local_timing_port) { 111 - Ok(ports) => ports, 112 - Err(e) => { 113 - tracing::error!("SETUP failed: {}", e); 114 - return -1; 115 - } 116 - }; 178 + let pacing = PacingClock::new(initial_rtptime); 179 + let session = AirPlaySession { 180 + receivers, 181 + timing, 182 + rtsp_clients, 183 + buf: Vec::with_capacity(PCM_BYTES_PER_FRAME * 4), 184 + first_frame: true, 185 + pacing, 186 + }; 187 + 188 + session.send_initial_sync(); 189 + tracing::info!( 190 + "session established: {}/{} receiver(s) connected", 191 + connected, 192 + 
targets.len() 193 + ); 194 + 195 + *SESSION.lock().unwrap() = Some(session); 196 + 0 197 + } 117 198 118 - if let Err(e) = sender.connect_server(&host, server_audio, server_ctrl) { 119 - tracing::error!("connect_server failed: {}", e); 120 - return -1; 199 + /// Connect to a single AirPlay receiver. Returns `(ReceiverHandle, RtspClient)` on success. 200 + fn connect_one( 201 + host: &str, 202 + port: u16, 203 + initial_rtptime: u32, 204 + local_timing_port: u16, 205 + ) -> std::io::Result<(ReceiverHandle, RtspClient)> { 206 + let local_ip = local_ip_for(host).unwrap_or_else(|| "127.0.0.1".to_string()); 207 + 208 + // Attempt AirPlay 2 pairing (non-fatal fallback to AirPlay 1). 209 + match airplay2::connect(host, port, None) { 210 + Ok(()) => tracing::info!("AirPlay 2 handshake complete for {}:{}", host, port), 211 + Err(e) => tracing::debug!( 212 + "AirPlay 2 skipped for {}:{} ({}), using AirPlay 1", 213 + host, 214 + port, 215 + e 216 + ), 121 217 } 122 218 123 - if let Err(e) = rtsp.record(0, initial_rtptime) { 124 - tracing::error!("RECORD failed: {}", e); 125 - return -1; 126 - } 219 + let session_token: u64 = rand::random(); 220 + 221 + let mut rx = ReceiverHandle::bind()?; 222 + let local_ctrl_port = rx.local_ctrl_port; 223 + 224 + let mut rtsp = RtspClient::connect(host, port, session_token)?; 225 + rtsp.announce(&local_ip, host)?; 226 + 227 + let (server_audio, server_ctrl, _server_timing) = 228 + rtsp.setup(local_ctrl_port, local_timing_port)?; 229 + 230 + rx.connect(host, server_audio, server_ctrl)?; 231 + rtsp.record(0, initial_rtptime)?; 127 232 128 233 // Set volume to maximum; RAOP range: -144.0 (mute) to 0.0 (full). 
129 234 if let Err(e) = rtsp.set_parameter_volume(0.0) { 130 - tracing::warn!("SET_PARAMETER volume failed (non-fatal): {}", e); 235 + tracing::warn!( 236 + "SET_PARAMETER volume failed for {}:{} (non-fatal): {}", 237 + host, 238 + port, 239 + e 240 + ); 131 241 } 132 242 133 - sender.send_initial_sync(); 134 - tracing::info!( 135 - "session established — sending audio to {}:{}", 136 - host, 137 - server_audio 138 - ); 139 - 140 - let mut guard = SESSION.lock().unwrap(); 141 - *guard = Some(AirPlaySession { 142 - sender, 143 - rtsp, 144 - buf: Vec::with_capacity(PCM_BYTES_PER_FRAME * 4), 145 - first_frame: true, 146 - }); 147 - 148 - 0 243 + Ok((rx, rtsp)) 149 244 } 150 245 151 - /// Write raw S16LE stereo PCM. Buffers into 352-sample frames, encodes ALAC, sends RTP. 246 + /// Write raw S16LE stereo PCM. Buffers into 352-sample frames, encodes ALAC, 247 + /// fans out to every connected receiver, then paces once. 152 248 #[no_mangle] 153 249 pub extern "C" fn pcm_airplay_write(data: *const u8, len: usize) -> c_int { 154 250 if data.is_null() || len == 0 { ··· 166 262 }; 167 263 168 264 if session.first_frame { 169 - tracing::debug!("first write: {} bytes", len); 265 + tracing::debug!( 266 + "first write: {} bytes, {} receiver(s)", 267 + len, 268 + session.receivers.len() 269 + ); 170 270 } 171 271 172 272 session.buf.extend_from_slice(input); ··· 176 276 session.buf[..PCM_BYTES_PER_FRAME].try_into().unwrap(); 177 277 session.buf.drain(..PCM_BYTES_PER_FRAME); 178 278 179 - let alac = encode_frame(&frame_bytes); 180 279 let first = session.first_frame; 181 280 session.first_frame = false; 182 - session.sender.send_audio(&alac, first); 281 + session.send_frame(&frame_bytes, first); 183 282 } 184 283 185 284 0 ··· 189 288 pub extern "C" fn pcm_airplay_stop() { 190 289 let mut guard = SESSION.lock().unwrap(); 191 290 if let Some(ref mut session) = *guard { 192 - let _ = session.rtsp.teardown(); 291 + for rtsp in &mut session.rtsp_clients { 292 + let _ = rtsp.teardown(); 
293 + } 193 294 } 194 295 *guard = None; 195 296 } ··· 198 299 pub extern "C" fn pcm_airplay_close() { 199 300 let mut guard = SESSION.lock().unwrap(); 200 301 if let Some(ref mut session) = *guard { 201 - let _ = session.rtsp.teardown(); 302 + for rtsp in &mut session.rtsp_clients { 303 + let _ = rtsp.teardown(); 304 + } 202 305 } 203 306 *guard = None; 204 307 } 308 + 309 + // --------------------------------------------------------------------------- 310 + // Helpers 311 + // --------------------------------------------------------------------------- 205 312 206 313 fn local_ip_for(remote: &str) -> Option<String> { 207 314 use std::net::UdpSocket;
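The connect loop in this file tolerates partial failure: each receiver is tried in turn, failures are logged, and the session only fails when nothing connected. A minimal sketch of that pattern follows, with the per-receiver handshake abstracted into a closure. `connect_all` and its return shape are hypothetical and only illustrate the control flow; the real code pushes `ReceiverHandle` and `RtspClient` values and logs through `tracing`.

```rust
use std::fmt::Display;

/// Try every target; keep the successes and collect the failures.
/// Returns `None` only when no target could be reached (or none was given).
pub fn connect_all<T, E, F>(
    targets: &[(String, u16)],
    mut connect: F,
) -> Option<(Vec<T>, Vec<String>)>
where
    E: Display,
    F: FnMut(&str, u16) -> Result<T, E>,
{
    let mut connected = Vec::new();
    let mut failures = Vec::new();
    for (host, port) in targets {
        match connect(host, *port) {
            Ok(handle) => connected.push(handle),
            // A single unreachable receiver is recorded, not fatal.
            Err(e) => failures.push(format!("{host}:{port}: {e}")),
        }
    }
    if connected.is_empty() {
        None
    } else {
        Some((connected, failures))
    }
}
```

Returning the failure list instead of logging inline keeps the sketch free of external crates; a caller could feed each entry to `tracing::warn!`.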
+84 -108
crates/airplay/src/rtp.rs
··· 11 11 const NTP_EPOCH_DELTA: u32 = 0x83AA_7E80; 12 12 13 13 // Duration of one ALAC frame at 44100 Hz 14 - const FRAME_DURATION_US: u64 = FRAME_SAMPLES as u64 * 1_000_000 / 44100; // ~7982 µs 14 + pub const FRAME_DURATION_US: u64 = FRAME_SAMPLES as u64 * 1_000_000 / 44100; // ~7982 µs 15 15 16 - pub struct RtpSender { 16 + /// Per-receiver UDP state. One of these per AirPlay endpoint. 17 + pub struct ReceiverHandle { 17 18 audio_sock: UdpSocket, 18 19 ctrl_sock: UdpSocket, 19 20 server_ctrl_addr: std::net::SocketAddr, 20 - ssrc: u32, 21 - seqnum: u16, 22 - rtptime: u32, 23 - initial_rtptime: u32, 24 - frames_sent: u64, 25 21 pub local_ctrl_port: u16, 26 - pub local_timing_port: u16, 27 - stream_start: Option<Instant>, 28 - // kept alive so the OS port stays open; responder thread holds the other Arc 29 - _timing_sock: Arc<UdpSocket>, 22 + pub ssrc: u32, 23 + pub seqnum: u16, 30 24 } 31 25 32 - impl RtpSender { 33 - /// Bind all local UDP sockets. `connect_server()` must be called after SETUP 34 - /// once the server's ports are known. 35 - pub fn bind(ssrc: u32, initial_rtptime: u32) -> std::io::Result<Self> { 26 + impl ReceiverHandle { 27 + /// Bind local audio and ctrl sockets. Call `connect()` after SETUP. 28 + pub fn bind() -> std::io::Result<Self> { 36 29 let audio_sock = UdpSocket::bind("0.0.0.0:0")?; 37 30 let ctrl_sock = UdpSocket::bind("0.0.0.0:0")?; 38 31 let local_ctrl_port = ctrl_sock.local_addr()?.port(); 39 - let timing_sock = Arc::new(UdpSocket::bind("0.0.0.0:0")?); 40 - let local_timing_port = timing_sock.local_addr()?.port(); 41 - 42 - // Respond to NTP timing requests from the receiver so it can synchronise 43 - // and actually start playing. Without this, the timing port gets ICMP 44 - // unreachable replies and many receivers stall indefinitely. 
45 - let timing_thread = Arc::clone(&timing_sock); 46 - thread::spawn(move || timing_responder(timing_thread)); 47 - 32 + let ssrc: u32 = rand::random(); 48 33 let server_ctrl_addr = "0.0.0.0:0".parse().unwrap(); 49 - tracing::debug!( 50 - "local ctrl_port={} timing_port={}", 51 - local_ctrl_port, 52 - local_timing_port 53 - ); 54 - 55 34 Ok(Self { 56 35 audio_sock, 57 36 ctrl_sock, 58 37 server_ctrl_addr, 38 + local_ctrl_port, 59 39 ssrc, 60 40 seqnum: 0, 61 - rtptime: initial_rtptime, 62 - initial_rtptime, 63 - frames_sent: 0, 64 - local_ctrl_port, 65 - local_timing_port, 66 - stream_start: None, 67 - _timing_sock: timing_sock, 68 41 }) 69 42 } 70 43 71 44 /// Connect the audio socket to the server's RTP port and record the ctrl addr. 72 - pub fn connect_server( 73 - &mut self, 74 - host: &str, 75 - audio_port: u16, 76 - ctrl_port: u16, 77 - ) -> std::io::Result<()> { 45 + pub fn connect(&mut self, host: &str, audio_port: u16, ctrl_port: u16) -> std::io::Result<()> { 78 46 tracing::debug!("connecting audio → {}:{}", host, audio_port); 79 47 self.audio_sock 80 48 .connect(format!("{}:{}", host, audio_port))?; ··· 84 52 Ok(()) 85 53 } 86 54 87 - pub fn send_audio(&mut self, alac_frame: &[u8; ALAC_FRAME_BYTES], first: bool) { 88 - let start = *self.stream_start.get_or_insert_with(Instant::now); 89 - 55 + /// Build and send one RTP audio packet. Increments seqnum. 
56 + pub fn send_audio_packet( 57 + &mut self, 58 + alac_frame: &[u8; ALAC_FRAME_BYTES], 59 + rtptime: u32, 60 + frame_index: u64, 61 + first: bool, 62 + ) { 90 63 let mut pkt = [0u8; RTP_PACKET_BYTES]; 91 64 pkt[0] = 0x80; 92 65 pkt[1] = if first { 0x60 | 0x80 } else { 0x60 }; // M=1 on first, PT=96 93 66 pkt[2] = (self.seqnum >> 8) as u8; 94 67 pkt[3] = self.seqnum as u8; 95 - pkt[4] = (self.rtptime >> 24) as u8; 96 - pkt[5] = (self.rtptime >> 16) as u8; 97 - pkt[6] = (self.rtptime >> 8) as u8; 98 - pkt[7] = self.rtptime as u8; 68 + pkt[4] = (rtptime >> 24) as u8; 69 + pkt[5] = (rtptime >> 16) as u8; 70 + pkt[6] = (rtptime >> 8) as u8; 71 + pkt[7] = rtptime as u8; 99 72 pkt[8] = (self.ssrc >> 24) as u8; 100 73 pkt[9] = (self.ssrc >> 16) as u8; 101 74 pkt[10] = (self.ssrc >> 8) as u8; ··· 104 77 105 78 match self.audio_sock.send(&pkt) { 106 79 Ok(_) => { 107 - if self.frames_sent < 5 { 80 + if frame_index < 5 { 108 81 tracing::debug!( 109 82 "sent frame {} ts={} seq={} first={}", 110 - self.frames_sent, 111 - self.rtptime, 83 + frame_index, 84 + rtptime, 112 85 self.seqnum, 113 86 first 114 87 ); 115 88 } 116 89 } 117 - Err(e) => tracing::warn!("send error on frame {}: {}", self.frames_sent, e), 90 + Err(e) => tracing::warn!("send error on frame {}: {}", frame_index, e), 118 91 } 119 - 120 92 self.seqnum = self.seqnum.wrapping_add(1); 121 - self.rtptime = self.rtptime.wrapping_add(FRAME_SAMPLES as u32); 122 - self.frames_sent += 1; 123 - 124 - // RTCP NTP sync every ~44 frames (~0.35 s) 125 - if self.frames_sent % 44 == 0 { 126 - self.send_sync(false); 127 - } 128 - 129 - // Real-time pacing — sleep until the frame's playout deadline. 130 - let expected = start + Duration::from_micros(self.frames_sent * FRAME_DURATION_US); 131 - let now = Instant::now(); 132 - if expected > now { 133 - std::thread::sleep(expected - now); 134 - } 135 93 } 136 94 137 - fn send_sync(&self, first: bool) { 95 + /// Send an RTCP NTP sync packet on the ctrl socket. 
96 + pub fn send_sync(&self, current_ts: u32, next_ts: u32, first: bool) { 138 97 let now = SystemTime::now() 139 98 .duration_since(UNIX_EPOCH) 140 99 .unwrap_or_default(); 141 100 let ntp_sec = now.as_secs() as u32 + NTP_EPOCH_DELTA; 142 101 let ntp_frac = ((now.subsec_nanos() as u64 * (1u64 << 32)) / 1_000_000_000) as u32; 143 - 144 - // "current" timestamp = frame we just sent (rtptime was already incremented) 145 - let current_ts = self.rtptime.wrapping_sub(FRAME_SAMPLES as u32); 146 - // "next" timestamp = self.rtptime (next frame to be sent) 147 - let next_ts = self.rtptime; 148 102 149 103 let mut pkt = [0u8; 20]; 150 104 pkt[0] = if first { 0x90 } else { 0x80 }; ··· 170 124 171 125 let _ = self.ctrl_sock.send_to(&pkt, self.server_ctrl_addr); 172 126 } 127 + } 173 128 174 - pub fn send_initial_sync(&self) { 175 - // At startup no frames have been sent yet; use initial_rtptime for both 176 - // "current" and "next" so we don't send a backwards-wrapped timestamp. 177 - let now = SystemTime::now() 178 - .duration_since(UNIX_EPOCH) 179 - .unwrap_or_default(); 180 - let ntp_sec = now.as_secs() as u32 + NTP_EPOCH_DELTA; 181 - let ntp_frac = ((now.subsec_nanos() as u64 * (1u64 << 32)) / 1_000_000_000) as u32; 182 - let ts = self.initial_rtptime; 129 + /// Shared NTP timing socket. One instance serves all receivers — the responder 130 + /// doesn't care which receiver the request came from, it just replies in place. 
131 + pub struct TimingSocket { 132 + pub local_port: u16, 133 + // kept alive so the OS port stays open; responder thread holds the other Arc 134 + _sock: Arc<UdpSocket>, 135 + } 183 136 184 - let mut pkt = [0u8; 20]; 185 - pkt[0] = 0x90; // first sync: extension bit set 186 - pkt[1] = 0xd4; 187 - pkt[2] = 0x00; 188 - pkt[3] = 0x07; 189 - pkt[4] = (ts >> 24) as u8; 190 - pkt[5] = (ts >> 16) as u8; 191 - pkt[6] = (ts >> 8) as u8; 192 - pkt[7] = ts as u8; 193 - pkt[8] = (ntp_sec >> 24) as u8; 194 - pkt[9] = (ntp_sec >> 16) as u8; 195 - pkt[10] = (ntp_sec >> 8) as u8; 196 - pkt[11] = ntp_sec as u8; 197 - pkt[12] = (ntp_frac >> 24) as u8; 198 - pkt[13] = (ntp_frac >> 16) as u8; 199 - pkt[14] = (ntp_frac >> 8) as u8; 200 - pkt[15] = ntp_frac as u8; 201 - pkt[16] = (ts >> 24) as u8; 202 - pkt[17] = (ts >> 16) as u8; 203 - pkt[18] = (ts >> 8) as u8; 204 - pkt[19] = ts as u8; 137 + impl TimingSocket { 138 + pub fn bind() -> std::io::Result<Self> { 139 + let sock = Arc::new(UdpSocket::bind("0.0.0.0:0")?); 140 + let local_port = sock.local_addr()?.port(); 141 + let thread_sock = Arc::clone(&sock); 142 + thread::spawn(move || timing_responder(thread_sock)); 143 + tracing::debug!("timing responder bound on port {}", local_port); 144 + Ok(Self { 145 + local_port, 146 + _sock: sock, 147 + }) 148 + } 149 + } 205 150 206 - let _ = self.ctrl_sock.send_to(&pkt, self.server_ctrl_addr); 207 - tracing::debug!("sent initial sync ts={}", ts); 151 + /// Pacing state shared across all receivers. 152 + pub struct PacingClock { 153 + pub stream_start: Option<Instant>, 154 + pub frames_sent: u64, 155 + pub rtptime: u32, 156 + pub initial_rtptime: u32, 157 + } 158 + 159 + impl PacingClock { 160 + pub fn new(initial_rtptime: u32) -> Self { 161 + Self { 162 + stream_start: None, 163 + frames_sent: 0, 164 + rtptime: initial_rtptime, 165 + initial_rtptime, 166 + } 208 167 } 209 168 210 - pub fn reset_clock(&mut self) { 169 + /// Advance after sending one frame to all receivers. 
170 + pub fn advance(&mut self) { 171 + self.rtptime = self.rtptime.wrapping_add(FRAME_SAMPLES as u32); 172 + self.frames_sent += 1; 173 + } 174 + 175 + /// Sleep until the current frame's real-time deadline. 176 + pub fn pace(&mut self) { 177 + let start = *self.stream_start.get_or_insert_with(Instant::now); 178 + let expected = start + Duration::from_micros(self.frames_sent * FRAME_DURATION_US); 179 + let now = Instant::now(); 180 + if expected > now { 181 + std::thread::sleep(expected - now); 182 + } 183 + } 184 + 185 + pub fn reset(&mut self) { 211 186 self.stream_start = None; 212 187 self.frames_sent = 0; 188 + self.rtptime = self.initial_rtptime; 213 189 } 214 190 } 215 191
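The byte-by-byte packet construction in this file can also be written with `to_be_bytes`, which makes the big-endian field layout easier to see. The sketch below covers the 12-byte RTP audio header and the NTP timestamp conversion used by the sync packet. The function names are illustrative and not part of the crate, and `wrapping_add` is used here as a defensive choice where the original uses plain addition.

```rust
/// Seconds between the NTP epoch (1900) and the Unix epoch (1970).
const NTP_EPOCH_DELTA: u32 = 0x83AA_7E80;

/// Build the 12-byte RTP header used for audio packets:
/// version 2, payload type 96, marker bit set on the first packet.
pub fn rtp_header(seqnum: u16, rtptime: u32, ssrc: u32, first: bool) -> [u8; 12] {
    let mut hdr = [0u8; 12];
    hdr[0] = 0x80;
    hdr[1] = if first { 0x60 | 0x80 } else { 0x60 };
    hdr[2..4].copy_from_slice(&seqnum.to_be_bytes());
    hdr[4..8].copy_from_slice(&rtptime.to_be_bytes());
    hdr[8..12].copy_from_slice(&ssrc.to_be_bytes());
    hdr
}

/// Convert Unix seconds plus nanoseconds into a 32.32 fixed-point NTP timestamp.
pub fn ntp_timestamp(unix_secs: u64, subsec_nanos: u32) -> (u32, u32) {
    let secs = (unix_secs as u32).wrapping_add(NTP_EPOCH_DELTA);
    // Scale nanoseconds to a fraction of 2^32; u64 avoids overflow.
    let frac = ((subsec_nanos as u64 * (1u64 << 32)) / 1_000_000_000) as u32;
    (secs, frac)
}
```

Because every receiver gets the same `rtptime` and only `seqnum` and `ssrc` differ per handle, a helper like `rtp_header` could be called once per receiver with shared timestamp input.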
+1
crates/rpc/src/lib.rs
···
     fifo_path: None,
     airplay_host: None,
     airplay_port: None,
+    airplay_receivers: None,
     squeezelite_http_port: None,
     squeezelite_port: None,
 }
+15 -3
crates/settings/src/lib.rs
···
             tracing::info!("audio output: fifo ({})", path);
         }
         Some("airplay") => {
-            if let Some(ref host) = settings.airplay_host {
+            pcm::airplay_clear_receivers();
+            // Multi-room list takes precedence over the legacy single-host fields.
+            if let Some(ref receivers) = settings.airplay_receivers {
+                if receivers.is_empty() {
+                    tracing::warn!("audio output: airplay_receivers is empty");
+                }
+                for r in receivers {
+                    let port = r.port.unwrap_or(5000);
+                    pcm::airplay_add_receiver(&r.host, port);
+                    tracing::info!("audio output: airplay receiver {}:{}", r.host, port);
+                }
+                pcm::switch_sink(pcm::PCM_SINK_AIRPLAY);
+            } else if let Some(ref host) = settings.airplay_host {
                 let port = settings.airplay_port.unwrap_or(5000);
                 pcm::airplay_set_host(host, port);
                 pcm::switch_sink(pcm::PCM_SINK_AIRPLAY);
-                tracing::info!("audio output: airplay ({}:{})", host, port);
+                tracing::info!("audio output: airplay {}:{}", host, port);
             } else {
-                tracing::warn!("audio output: airplay selected but airplay_host is not set");
+                tracing::warn!("audio output: airplay selected but no receiver configured");
             }
         }
         Some("squeezelite") => {
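The precedence rule in this hunk (a present `airplay_receivers` list wins over `airplay_host`/`airplay_port`, even when the list is empty) can be expressed as a pure function, which is easier to unit-test than the side-effecting version. This is a sketch only: `AirPlaySettings`, `ReceiverConfig`, and `resolve_receivers` are simplified stand-ins invented for illustration, not the crate's types.

```rust
/// Simplified stand-in for one `[[airplay_receivers]]` entry.
#[derive(Debug, Clone, Default)]
pub struct ReceiverConfig {
    pub host: String,
    pub port: Option<u16>,
}

/// Simplified stand-in for the AirPlay-related settings fields.
#[derive(Debug, Clone, Default)]
pub struct AirPlaySettings {
    pub airplay_host: Option<String>,
    pub airplay_port: Option<u16>,
    pub airplay_receivers: Option<Vec<ReceiverConfig>>,
}

const DEFAULT_RAOP_PORT: u16 = 5000;

/// Resolve the effective receiver list. A present list takes precedence,
/// even if empty; otherwise fall back to the legacy single-host fields.
pub fn resolve_receivers(s: &AirPlaySettings) -> Vec<(String, u16)> {
    if let Some(list) = &s.airplay_receivers {
        return list
            .iter()
            .map(|r| (r.host.clone(), r.port.unwrap_or(DEFAULT_RAOP_PORT)))
            .collect();
    }
    match &s.airplay_host {
        Some(host) => vec![(host.clone(), s.airplay_port.unwrap_or(DEFAULT_RAOP_PORT))],
        None => Vec::new(),
    }
}
```

An empty result corresponds to the two warning branches in the hunk: either the list was present but empty, or nothing was configured at all.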
+2
crates/sys/src/lib.rs
···
     fn pcm_switch_sink(sink: c_int) -> c_uchar;
     fn pcm_fifo_set_path(path: *const c_char);
     fn pcm_airplay_set_host(host: *const c_char, port: c_ushort);
+    fn pcm_airplay_add_receiver(host: *const c_char, port: c_ushort);
+    fn pcm_airplay_clear_receivers();
     fn pcm_squeezelite_set_slim_port(port: c_ushort);
     fn pcm_squeezelite_set_http_port(port: c_ushort);
     fn beep_play(frequency: c_uint, duration: c_uint, amplitude: c_uint);
+10 -1
crates/sys/src/sound/pcm.rs
···
 }

 pub fn airplay_set_host(host: &str, port: u16) {
-    use std::ffi::CString;
     let chost = CString::new(host).expect("host must not contain null bytes");
     unsafe { crate::pcm_airplay_set_host(chost.as_ptr(), port) }
     std::mem::forget(chost);
+}
+
+pub fn airplay_add_receiver(host: &str, port: u16) {
+    let chost = CString::new(host).expect("host must not contain null bytes");
+    unsafe { crate::pcm_airplay_add_receiver(chost.as_ptr(), port) }
+    std::mem::forget(chost);
+}
+
+pub fn airplay_clear_receivers() {
+    unsafe { crate::pcm_airplay_clear_receivers() }
 }

 pub fn fifo_set_path(path: &str) {
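These wrappers hand a `CString` pointer across the C ABI and then call `std::mem::forget`, which keeps the allocation alive for the rest of the process. If the callee copies the string before returning, as the `to_string_lossy().into_owned()` calls in the airplay crate suggest, the `CString` could instead be dropped normally. The sketch below illustrates that variant with a mock callee. `mock_add_receiver`, `add_receiver`, and `RECEIVERS` are hypothetical names, and whether the real callee always copies is an assumption that would need checking before changing the wrappers.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_ushort};
use std::sync::Mutex;

/// Stand-in for the real receiver list; hypothetical, for illustration only.
static RECEIVERS: Mutex<Vec<(String, u16)>> = Mutex::new(Vec::new());

/// Mock of a C-ABI callee that copies the host string before returning.
pub extern "C" fn mock_add_receiver(host: *const c_char, port: c_ushort) {
    if host.is_null() {
        return;
    }
    // SAFETY: the caller passes a valid NUL-terminated string that
    // stays alive for the duration of this call.
    let owned = unsafe { CStr::from_ptr(host) }
        .to_string_lossy()
        .into_owned();
    RECEIVERS.lock().unwrap().push((owned, port));
}

/// Safe wrapper: `chost` is dropped when the function returns, which is
/// sound only because the callee has already taken its own copy.
pub fn add_receiver(host: &str, port: u16) {
    let chost = CString::new(host).expect("host must not contain NUL bytes");
    mock_add_receiver(chost.as_ptr(), port);
}
```

If a callee stored the raw pointer instead of copying, dropping the `CString` would leave it dangling, so the copy-on-entry behaviour is the property to verify first.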
+14 -2
crates/sys/src/types/user_settings.rs
···
     }
 }

+/// One entry in the `airplay_receivers` list in settings.toml.
+#[derive(Default, Debug, Clone, Serialize, Deserialize)]
+pub struct AirPlayReceiverConfig {
+    pub host: String,
+    /// RAOP port (default: 5000)
+    pub port: Option<u16>,
+}
+
 #[derive(Default, Debug, Clone, Serialize, Deserialize)]
 pub struct NewGlobalSettings {
     pub music_dir: Option<String>,
···
     pub audio_output: Option<String>,
     /// Path for the FIFO sink, e.g. "/tmp/rockbox.fifo" or "-" for stdout
     pub fifo_path: Option<String>,
-    /// IP address or hostname of the AirPlay (RAOP) receiver
+    /// Single AirPlay (RAOP) receiver — kept for backward compatibility.
+    /// Prefer `airplay_receivers` for multi-room setups.
     pub airplay_host: Option<String>,
-    /// RAOP port on the receiver (default: 5000)
+    /// RAOP port for the single receiver (default: 5000)
     pub airplay_port: Option<u16>,
+    /// Multi-room AirPlay receiver list. Takes precedence over `airplay_host`/`airplay_port`.
+    pub airplay_receivers: Option<Vec<AirPlayReceiverConfig>>,
     /// Slim Protocol control port for the squeezelite sink (default: 3483)
     pub squeezelite_port: Option<u16>,
     /// HTTP audio stream port for the squeezelite sink (default: 9999)
···
             fifo_path: None,
             airplay_host: None,
             airplay_port: None,
+            airplay_receivers: None,
             squeezelite_port: None,
             squeezelite_http_port: None,
         }
+2
firmware/target/hosted/pcm-airplay.c
···
 /* Rust C API — symbols are provided by the rockbox-airplay crate via
  * librockbox_cli.a. */
 extern void pcm_airplay_set_host(const char *host, uint16_t port);
+extern void pcm_airplay_add_receiver(const char *host, uint16_t port);
+extern void pcm_airplay_clear_receivers(void);
 extern int pcm_airplay_connect(void);
 extern int pcm_airplay_write(const uint8_t *data, size_t len);
 extern void pcm_airplay_stop(void);