# Performance: Jetstream Event Fan-Out

## Current Architecture

Airglow maintains a **single WebSocket connection** to Jetstream, regardless of how many user subscriptions exist.

### How it works

1. **One WebSocket, deduplicated collections**: `JetstreamConsumer` is a singleton. On startup (and whenever subscriptions change), it loads all active subscriptions from the database and groups them by lexicon (collection). Only the **unique collection names** are sent to Jetstream as `wantedCollections` params. If 100 users subscribe to `app.bsky.feed.post`, Jetstream still delivers each event for that collection only once.

2. **In-memory fan-out**: When an event arrives, the consumer looks up all subscriptions for that collection in a `Map<string, Subscription[]>` and iterates through them, evaluating each subscription's conditions. Only matching subscriptions trigger their actions (webhook delivery or record creation).

3. **WebSocket reconnection on collection changes**: When a subscription is created or deleted and the set of watched collections changes as a result, the consumer closes and reopens the WebSocket with updated `wantedCollections` params. If only the subscriptions within an existing collection change, no reconnection is needed; the in-memory map is simply updated.
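
The three steps above can be sketched roughly as follows. This is a minimal illustration, not Airglow's actual code: the `Condition`, `Subscription`, and `CommitEvent` shapes, the equality-only condition model, and the function names `groupByCollection`, `buildSubscribeUrl`, `fanOut`, and `needsReconnect` are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; Airglow's real types may differ.
interface Condition {
  field: string; // top-level record field to compare
  equals: unknown; // value that field must strictly equal
}
interface Subscription {
  id: string;
  collection: string;
  conditions: Condition[];
}
interface CommitEvent {
  collection: string;
  record: Record<string, unknown>;
}

// Step 1: group subscriptions by collection. The map keys are the unique collections.
function groupByCollection(subs: Subscription[]): Map<string, Subscription[]> {
  const byCollection = new Map<string, Subscription[]>();
  for (const sub of subs) {
    const bucket = byCollection.get(sub.collection);
    if (bucket) bucket.push(sub);
    else byCollection.set(sub.collection, [sub]);
  }
  return byCollection;
}

// Step 1: emit one wantedCollections query param per unique collection.
// Keys are sorted only so the resulting URL is deterministic.
function buildSubscribeUrl(
  baseUrl: string,
  byCollection: Map<string, Subscription[]>,
): string {
  const params = Array.from(byCollection.keys())
    .sort()
    .map((c) => `wantedCollections=${encodeURIComponent(c)}`);
  return params.length > 0 ? `${baseUrl}?${params.join("&")}` : baseUrl;
}

// Step 2: O(1) lookup by collection, then a linear scan over that bucket.
function fanOut(
  byCollection: Map<string, Subscription[]>,
  event: CommitEvent,
): Subscription[] {
  const candidates = byCollection.get(event.collection) ?? [];
  return candidates.filter((sub) =>
    sub.conditions.every((c) => event.record[c.field] === c.equals),
  );
}

// Step 3: reconnect only when the set of watched collections differs.
function needsReconnect(
  before: Map<string, Subscription[]>,
  after: Map<string, Subscription[]>,
): boolean {
  if (before.size !== after.size) return true;
  return Array.from(after.keys()).some((c) => !before.has(c));
}
```

In this sketch, adding a second subscription to an already-watched collection changes only the bucket contents, so `needsReconnect` returns `false` and the existing WebSocket stays open.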

### Why this works well at current scale

- The `Map` lookup by collection is O(1).
- Condition matching is a linear scan over the subscriptions for that collection, which is fast for dozens or even hundreds of subscriptions per collection.
- All fan-out happens in-process with no network overhead.
- A single WebSocket minimizes Jetstream resource usage.

## Potential Future Improvements

As the number of subscriptions per collection grows (thousands+), the linear scan through conditions on every event could become a bottleneck. Some options:

- **Condition indexing**: Build inverted indexes on condition fields/values so only potentially matching subscriptions are evaluated, rather than scanning all of them.
- **Batch/parallel condition evaluation**: Evaluate conditions for multiple subscriptions concurrently rather than sequentially.
- **Sharded consumers**: Run multiple consumer instances, each responsible for a subset of collections or users, to distribute the fan-out load.
- **Pre-filtering with Jetstream features**: If Jetstream adds more granular filtering (e.g. by DID or record fields), leverage that to reduce the volume of events the consumer needs to process.
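
As one illustration of the condition-indexing option, the sketch below files each subscription under its first equality condition, so an event only pulls the buckets whose key it actually satisfies. It assumes conditions are simple string equalities on top-level record fields; `ConditionIndex` and the `Condition` and `Subscription` shapes are hypothetical and not part of the current codebase.

```typescript
// Hypothetical shapes: equality-only conditions on top-level string fields.
interface Condition {
  field: string;
  equals: string;
}
interface Subscription {
  id: string;
  conditions: Condition[];
}

class ConditionIndex {
  // JSON-encoded [field, value] -> subscriptions whose first condition is that pair.
  private readonly byKey = new Map<string, Subscription[]>();
  // Fields that appear as an index key, so lookups know which record fields to probe.
  private readonly indexedFields = new Set<string>();
  // Subscriptions with no conditions match every event and cannot be indexed.
  private readonly unconditional: Subscription[] = [];

  private static key(field: string, value: string): string {
    return JSON.stringify([field, value]);
  }

  add(sub: Subscription): void {
    const first = sub.conditions[0];
    if (first === undefined) {
      this.unconditional.push(sub);
      return;
    }
    const key = ConditionIndex.key(first.field, first.equals);
    const bucket = this.byKey.get(key);
    if (bucket) bucket.push(sub);
    else this.byKey.set(key, [sub]);
    this.indexedFields.add(first.field);
  }

  // Superset of the true matches: every subscription whose indexed condition holds.
  candidates(record: Record<string, unknown>): Subscription[] {
    const out = this.unconditional.slice();
    this.indexedFields.forEach((field) => {
      const value = record[field];
      if (typeof value !== "string") return;
      const bucket = this.byKey.get(ConditionIndex.key(field, value));
      if (bucket) out.push(...bucket);
    });
    return out;
  }

  // Full evaluation runs only on the candidates, not on every subscription.
  match(record: Record<string, unknown>): Subscription[] {
    return this.candidates(record).filter((sub) =>
      sub.conditions.every((c) => record[c.field] === c.equals),
    );
  }
}
```

The per-event cost then scales with the number of indexed fields plus the size of the matching buckets, rather than with the total number of subscriptions in the collection. Richer condition types (ranges, negation, nested fields) would need a different index or a fallback to the linear scan.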

None of these are needed today. The current design is simple, correct, and efficient for the expected scale.