# zig stdlib patches for the Evented backend

zlay runs on `Io.Evented` (the io_uring fiber scheduler) for network I/O. the
upstream zig 0.16-dev stdlib (`0.16.0-dev.3059+42e33db9d`) ships several
Uring networking operations as stubs that return `error.NetworkDown`. zlay
patches these at build time and works around other stdlib limitations.

this document tracks what we had to change, why, and what upstream work
would let us drop each workaround.
## patch 1: Uring networking (`patches/uring-networking.patch`)

**applied in**: `Dockerfile` line 13, patches `lib/std/Io/Uring.zig`

the upstream stdlib has these functions stubbed as `*Unavailable`:

```
netListenIpUnavailable → return error.NetworkDown
netAcceptUnavailable → return error.NetworkDown
netConnectIpUnavailable → return error.NetworkDown
netSendUnavailable → return error.NetworkDown
netReadUnavailable → return error.NetworkDown
netWriteUnavailable → return error.NetworkDown
```

without these, `Io.Evented` can init but any TCP operation fails immediately.
the patch replaces all six with working implementations:
| function | io_uring opcode | notes |
|---|---|---|
| `netListenIp` | sync `bind()` + `listen()` | IORING_OP_BIND/LISTEN need kernel 6.11+, so we use sync syscalls |
| `netAccept` | `IORING_OP_ACCEPT` | fiber yields until connection arrives |
| `netConnectIp` | `IORING_OP_CONNECT` | + socket creation via existing `ev.socket()` |
| `netSend` | `IORING_OP_SENDMSG` | iterates message array, one SENDMSG per message |
| `netRead` | `IORING_OP_READV` or `IORING_OP_READ` | scatter read; single-buffer fast path |
| `netWrite` | `IORING_OP_SENDMSG` | gather write with iovec assembly + splat pattern handling |

the patch also adds two helpers:
- `connect()` — submits `IORING_OP_CONNECT` SQE, handles retry on `EINTR`/`ECANCELED`
- `netSendOne()` — sends a single `OutgoingMessage` via `IORING_OP_SENDMSG`

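to make the accept path concrete, here is a minimal standalone sketch of the same shape: synchronous `bind()`/`listen()` followed by one `IORING_OP_ACCEPT` round trip. it is written against the long-standing `std.os.linux.IoUring` wrapper rather than the patched `Io.Uring` internals, and the port and the `0xACCE` tag are arbitrary. in the real patch the fiber yields at the CQE wait; this sketch just blocks on `copy_cqe()`:

```zig
const std = @import("std");
const posix = std.posix;
const linux = std.os.linux;

pub fn main() !void {
    const addr = try std.net.Address.parseIp4("127.0.0.1", 8080);

    // sync socket/bind/listen, as in the patched netListenIp.
    const listener = try posix.socket(posix.AF.INET, posix.SOCK.STREAM, 0);
    defer posix.close(listener);
    try posix.setsockopt(listener, posix.SOL.SOCKET, posix.SO.REUSEADDR, &std.mem.toBytes(@as(c_int, 1)));
    try posix.bind(listener, &addr.any, addr.getOsSockLen());
    try posix.listen(listener, 128);

    var ring = try linux.IoUring.init(8, 0);
    defer ring.deinit();

    // queue one IORING_OP_ACCEPT SQE and submit it; user_data tags the CQE.
    var peer: posix.sockaddr = undefined;
    var peer_len: posix.socklen_t = @sizeOf(posix.sockaddr);
    _ = try ring.accept(0xACCE, listener, &peer, &peer_len, 0);
    _ = try ring.submit();

    // block until the completion arrives; cqe.res is the accepted fd or -errno.
    const cqe = try ring.copy_cqe();
    if (cqe.res < 0) return error.AcceptFailed;
    posix.close(@intCast(cqe.res));
}
```
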
### why sync bind/listen

`IORING_OP_BIND` and `IORING_OP_LISTEN` were added in linux 6.11. production
runs on Debian bookworm (kernel 6.1). `bind()` and `listen()` are fast synchronous
calls anyway — no benefit from async submission. the rest of the networking
stack (accept, connect, read, write) uses proper io_uring async ops.

### why not upstream yet

tracked as zig issue #31723. the Uring networking layer is under active
development. our patch makes pragmatic choices (sync bind/listen, specific
error mappings) that may not match upstream's desired API shape. we'd want
to align with whatever design decisions the zig team makes before submitting.
### regenerating the patch

the patch is pinned to zig `0.16.0-dev.3059+42e33db9d`. any zig version
bump requires checking if the Uring.zig source changed and regenerating.

```bash
# to regenerate after a zig update:
diff -u /path/to/old-zig/lib/std/Io/Uring.zig /path/to/patched/Uring.zig > patches/uring-networking.patch
```

## workaround 2: DNS resolution via Threaded fallback

**not patched** — worked around in application code.

`Io.Uring` does not implement `netLookup` (DNS resolution). instead of
patching it, subscribers route DNS through `pool_io` (Threaded):

```zig
// subscriber.zig:326-330
// DNS + TCP connect through pool_io (Threaded — has working netLookup).
const dns_io = self.pool_io orelse self.io;
const net_stream = try host_name.connect(dns_io, 443, .{ .mode = .stream });
```

this works because `Io.Threaded.netLookup` uses `getaddrinfo` on a worker
thread. the resulting socket handle is then used with Evented I/O for
the actual data transfer (reads/writes go through the patched Uring ops).

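for reference, the blocking lookup that `Io.Threaded.netLookup` wraps looks like this in the stable stdlib (a sketch only; the pinned 0.16-dev tree routes this through the `Io` interface, and `example.com` is a placeholder host):

```zig
const std = @import("std");

pub fn main() !void {
    const gpa = std.heap.page_allocator;

    // blocking getaddrinfo under the hood; on Threaded this runs on a
    // worker thread so Evented fibers never block on DNS.
    const list = try std.net.getAddressList(gpa, "example.com", 443);
    defer list.deinit();

    std.debug.print("resolved {d} addresses\n", .{list.addrs.len});
}
```
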
**upstream fix**: implement `netLookup` in Uring.zig, probably by submitting
the blocking `getaddrinfo` call on an io_uring worker thread
(`IORING_OP_POLL_ADD` + thread pool, or the newer `IORING_OP_GETXATTR`
pattern). not blocking us — the Threaded fallback is fine.

## workaround 3: ReleaseSafe GPF in Uring fiber context

**not patched** — worked around by building with `ReleaseFast`.

```dockerfile
# Dockerfile line 21-23
# ReleaseFast (not ReleaseSafe): Io.Uring fiber context-switch GPFs under ReleaseSafe
RUN zig build -Doptimize=ReleaseFast ...
```

under `ReleaseSafe`, the optimizer's inlining interacts badly with Uring's
fiber context-switch machinery. the result is a general protection fault
during normal fiber yield/resume. `ReleaseFast` and `Debug` both work.
`scripts/repro_evented.zig` reproduces this — three simple fiber
tests (no-sleep, yield, sleep) that pass under Debug and ReleaseFast but
GPF under ReleaseSafe.

this is likely a zig codegen/optimizer bug. we haven't filed it yet because
the reproduction is minimal but the root-cause analysis is incomplete —
it could be a safety check reading a stale fiber stack, or an inlining
decision that breaks the stack-swap assumptions.

**upstream fix**: file a bug with the repro. probably a zig compiler issue,
not an Uring.zig issue.

## workaround 4: `std_options.debug_io` single-threaded default

**not patched** — worked around in `src/main.zig`.

```zig
// main.zig:62-64
var debug_threaded_io: Io.Threaded = undefined;
pub const std_options_debug_threaded_io: ?*Io.Threaded = &debug_threaded_io;
```

`std.debug.print` internally uses an `Io`-managed lock for output
serialization. the default (`debug_io = null`) assumes single-threaded
execution. zlay has multiple OS threads (frame worker pool, GC thread,
resyncer thread) that all call `std.debug.print` / `log.*`. without this
override, concurrent debug prints corrupt each other or deadlock.

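the hazard pattern is just plain OS threads printing concurrently. a minimal illustration (not zlay code; under the pinned dev stdlib this is the shape that needs the override above):

```zig
const std = @import("std");

// two OS threads interleaving std.debug.print calls. with the default
// single-threaded debug_io in the pinned dev stdlib, output like this can
// corrupt or deadlock; the Io.Threaded override serializes it safely.
fn chatter(tag: u8) void {
    for (0..100) |i| {
        std.debug.print("[{c}] line {d}\n", .{ tag, i });
    }
}

pub fn main() !void {
    const a = try std.Thread.spawn(.{}, chatter, .{@as(u8, 'a')});
    const b = try std.Thread.spawn(.{}, chatter, .{@as(u8, 'b')});
    a.join();
    b.join();
}
```
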
**upstream fix**: arguably the default should be safe for multi-threaded
programs. but explicit opt-in is reasonable — it requires initializing an
`Io.Threaded` instance at startup, which has a cost.

## workaround 5: `Io.Event.reset()` single-waiter assumption

**not patched** — worked around in pg.zig fork.

`Io.Event` has a `reset()` method with a stdlib invariant (Io.zig:1857):
it assumes no pending call to `wait`. when multiple threads contend for
a pooled resource (pg.Pool connections), `set()` wakes all waiters, one
calls `reset()`, and the others hit `unreachable`.

the pg.zig fork (`5ce2355`, dev branch) replaced `Io.Event` with a
monotonic `u32` futex counter:
- `release()` increments the counter + `futexWake(1)` (wake one)
- `acquire()` snapshots the counter under the mutex + `futexWaitTimeout()` with the snapshot
- no `reset()`, no single-waiter constraint

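the same pattern, sketched with `std.Thread.Futex` and `std.atomic` instead of the Io-level futex calls the fork actually uses (`WakeCounter` and its method names are illustrative, not pg.zig's):

```zig
const std = @import("std");

const WakeCounter = struct {
    counter: std.atomic.Value(u32) = std.atomic.Value(u32).init(0),

    // release(): bump the monotonic counter, then wake exactly one waiter.
    fn release(self: *WakeCounter) void {
        _ = self.counter.fetchAdd(1, .release);
        std.Thread.Futex.wake(&self.counter, 1);
    }

    // acquire() side: take a snapshot (in pg.zig, under the pool mutex),
    // then wait until the counter moves past it. if a release already
    // happened, the counter no longer matches the snapshot and the wait
    // returns immediately: no lost wakeup, and no reset() step that
    // assumes a single waiter.
    fn snapshot(self: *WakeCounter) u32 {
        return self.counter.load(.acquire);
    }

    fn waitForRelease(self: *WakeCounter, snap: u32, timeout_ns: u64) error{Timeout}!void {
        try std.Thread.Futex.timedWait(&self.counter, snap, timeout_ns);
    }
};

test "release before wait does not lose the wakeup" {
    var wc = WakeCounter{};
    const snap = wc.snapshot();
    wc.release();
    try wc.waitForRelease(snap, std.time.ns_per_s);
}
```
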
**upstream fix**: `Io.Event` could support multi-waiter reset, or provide a
semaphore/condvar primitive. the futex counter pattern is well-known and
could be upstreamed to pg.zig proper.

## workaround 6: cross-Io Mutex/futex incompatibility

**not patched** — worked around by careful Io segregation.

`Io.Mutex` and `Io.Condition` use futex operations that are tied to their
Io backend. calling `mutex.lockUncancelable(threaded_io)` from an Evented
fiber dereferences `Thread.current()` — a threadlocal only set on
Uring-managed threads. on Evented fibers it's NULL → SIGSEGV or heap corruption.

this caused three separate crashes during the migration (crashes 1, 6, 8 in
docs/notes.md). the fix pattern is always the same: components that use Threaded
resources (mutexes initialized with `pool_io`, pg.Pool) must run on plain
`std.Thread`, not as Evented `io.concurrent()` fibers.

current segregation:

| component | runs on | why |
|---|---|---|
| GC loop | `std.Thread` + `pool_io` | uses DiskPersist mutex + pg.Pool |
| resyncer | `std.Thread` + `pool_io` | uses DiskPersist + HTTP client |
| frame workers | `std.Thread` + `pool_io` | uses Io.Mutex/Condition for queue sync |
| subscribers | `io.concurrent` (Evented) | pure network I/O, no shared mutexes |
| broadcast loop | `io.concurrent` (Evented) | lock-free ring buffer + atomics |
| health checks | Evented handlers | use atomic `last_db_success`, not pg.Pool |

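the rule in code form: a component that touches Threaded resources gets a dedicated OS thread. a runnable sketch (the `GcContext`/`gcLoop` names are illustrative stand-ins, not zlay's actual types):

```zig
const std = @import("std");

// shape of the segregation rule: anything that touches Threaded-backed
// resources runs on a plain OS thread, never as an Evented fiber.
const GcContext = struct {
    shutdown: std.atomic.Value(bool) = std.atomic.Value(bool).init(false),
};

fn gcLoop(ctx: *GcContext) void {
    while (!ctx.shutdown.load(.acquire)) {
        // real loop: take the DiskPersist mutex, hit pg.Pool via pool_io —
        // Threaded resources, so this must stay off the Evented scheduler.
        std.Thread.sleep(100 * std.time.ns_per_ms);
    }
}

pub fn main() !void {
    var ctx = GcContext{};
    // a dedicated OS thread, not io.concurrent():
    const gc = try std.Thread.spawn(.{}, gcLoop, .{&ctx});
    std.Thread.sleep(300 * std.time.ns_per_ms);
    ctx.shutdown.store(true, .release);
    gc.join();
}
```
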
**upstream fix**: there's no obvious stdlib fix here — this is architectural.
either Mutex/Condition need to detect and handle cross-Io calls, or the docs
need to clearly state the constraint. a `pg.Pool` that accepts an `Io`
parameter per-call (rather than at init) would also help.

## summary table

| # | issue | fix type | status | drops when |
|---|---|---|---|---|
| 1 | Uring networking stubs | patch | `patches/uring-networking.patch` | upstream implements (zig#31723) |
| 2 | DNS resolution missing | app workaround | Threaded fallback in subscriber | upstream implements netLookup |
| 3 | ReleaseSafe GPF | build flag | `-Doptimize=ReleaseFast` | upstream fixes codegen bug |
| 4 | debug_io single-threaded | app workaround | `std_options_debug_threaded_io` | upstream changes default or n/a |
| 5 | Io.Event single-waiter | dep fork | pg.zig futex counter | upstream adds multi-waiter Event |
| 6 | cross-Io Mutex | app architecture | Io segregation | upstream makes Mutex cross-Io safe |