# Regenesis Watchman: Change-Driven Tree Coherence

> This document revisits watch/change architecture in light of the current staged tree work. It is not a copy of the original watchman plan; it focuses on coherence risks, change routing, and the registries we now operate.

## Problem

The crate now has a stronger staged model (plan -> resolve -> hydrate) and is moving toward mmap-first tree leaves. That improves ergonomics and memory behavior, but it also raises consistency challenges when storage changes while staged work is in progress.

We need a change architecture that guarantees:

1. Structural queries stay coherent with filesystem reality.
2. Tree stages can detect when their inputs are stale.
3. Reference leaves remain valid snapshots while allowing lazy reloads for subsequent requests.
4. Consumers receive structured change signals, not opaque path strings.

## Current State

Current repositories/registries already present in code:

### Filesystem repositories (source of truth)

- Session files: `storage/session/<project>/<session>.json`
- Message files: `storage/message/<session>/<message>.json`
- Part files: `storage/part/<message>/<part>.json`
- Session diffs: `storage/session_diff/<session>.json`

Path modeling is in [`/src/storage/paths.rs`](/src/storage/paths.rs).

### In-memory structural registries

`SessionIndex` in [`/src/index.rs`](/src/index.rs) currently stores:

- `session_metas`
- `message_metas`
- `part_refs`
- `session_ids_by_project`
- `message_ids_by_session`
- `part_ids_by_message`

These are the primary relationship registries where structural changes must flow.

### In-memory mapping registry

`MappedFileCache` in [`/src/storage/mmap.rs`](/src/storage/mmap.rs) stores:

- `path -> Arc<MappedFile>`

This is the byte-level registry for payload access and snapshot semantics.

### Flow-stage registries (ephemeral)

Flow decomposition in [`/src/materializer.rs`](/src/materializer.rs) adds staged containers:

- `SessionFlowScope`
- `MessageFlowScope`
- `SessionFlowResult`

These are per-request repositories and can become stale if changes happen mid-flow.

## Key Concerns Introduced by Watch-Driven Updates

### 1) Multi-registry coherence

A single file change can affect multiple registries (for example, deleting a session affects the session map, reverse indexes, message maps, part refs, and mmap cache entries).

Concern:

- A partial apply can leave the index and cache inconsistent.

### 2) Stage staleness during plan -> resolve -> hydrate

A flow plan can be generated at generation `g`, but resolve/hydrate may run after `g+1` changes arrive.

Concern:

- Planned IDs may refer to deleted or replaced entities.

### 3) Path-key vs entity-key drift

Watchman emits paths; tree APIs use entity IDs and relationships.

Concern:

- Path-only invalidation is insufficient for tree-level coherence and CDC semantics.

### 4) `is_fresh_instance` and continuity break

When Watchman loses continuity, incremental guarantees are gone.

Concern:

- All registries need a synchronized recovery policy, not ad-hoc partial clears.

### 5) Backpressure and event loss behavior

Burst writes can overrun in-process queues.

Concern:

- If drops occur, consumers need explicit resync signaling and cursor semantics.

### 6) Mmap lifetime versus "latest" view

`Arc<MappedFile>` intentionally provides snapshot behavior for in-flight users.

Concern:

- Without generation metadata, callers cannot know whether a given leaf is the latest or stale.

### 7) Lock contention under hot write streams

Naive write-heavy index updates can block read paths.

Concern:

- Staged tree APIs lose responsiveness under high change rates.

## Registries/Repositories Where Changes Should Flow

For this architecture, change propagation should explicitly target these repositories in order:

1. **Change ingest repository**
   - raw file changes from watch backend
   - canonicalized `FileChange` records

2. **Classification repository**
   - `FileChange -> EntityKey + ChangeOp`
   - path and identity joined into one event

3. **Structural registry repository** (SessionIndex maps)
   - upsert/remove metadata and reverse indexes

4. **Mapping registry repository** (MappedFileCache)
   - evict or mark stale path mappings

5. **Generation registry repository**
   - `generation`, `entity_dirty`, `path_dirty`, and per-batch cursor sequence

6. **Flow coherence repository**
   - plan/resolve/hydrate generation stamps and stale checks

7. **CDC/event repository**
   - low-level `CdcEvent`
   - session-level `SessionUpdate`

8. **Observability repository**
   - metrics, traces, lag, drop counters

## Draft Architecture (Tree-Aware)

```mermaid
flowchart TD
    Feed[Watchman/notify/manual feed] --> Ingest[FileChange ingest repository]
    Ingest --> Classify[Entity classifier repository]
    Classify --> ApplyIndex[SessionIndex apply repository]
    Classify --> ApplyCache[MappedFileCache apply repository]
    Classify --> ApplyGen[Generation registry]

    ApplyIndex --> Plan[Flow planner]
    ApplyGen --> Plan
    Plan --> Resolve[Ref-tree resolver]
    ApplyGen --> Resolve
    Resolve --> Hydrate[Optional hydrator]

    Classify --> CDC[CdcEvent repository]
    CDC --> SessionUpdates[SessionUpdate repository]
    CDC --> Metrics[Observability repository]
```

## Key Design Choices

### 1) Single-writer apply loop

Decision:

- One writer task applies all structural/cache/generation updates in ordered batches.

Why:

- Ensures deterministic registry mutation order.
- Avoids interleaving bugs across related maps.
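
A minimal sketch of the single-writer idea, using only the standard library. `Registries`, `Change`, `apply_batch`, and `run_apply_loop` are hypothetical stand-ins invented for illustration; the real `SessionIndex` and `MappedFileCache` have more maps than shown here. Each batch is applied under one write lock and bumps the generation once, so readers never observe a half-applied batch.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::mpsc::Receiver;
use std::sync::{Arc, RwLock};

/// Hypothetical, simplified stand-in for the structural, mapping,
/// and generation registries.
#[derive(Default)]
pub struct Registries {
    pub generation: u64,
    /// session id -> metadata stub
    pub session_metas: HashMap<String, String>,
    /// stands in for the keys held by the mmap cache
    pub cached_paths: HashSet<String>,
}

/// Hypothetical classified change; only sessions are modeled here.
pub enum Change {
    UpsertSession { id: String, meta: String, path: String },
    RemoveSession { id: String, path: String },
}

/// Apply one batch under a single write lock and bump the generation once.
/// Returns the generation assigned to the batch.
pub fn apply_batch(regs: &RwLock<Registries>, batch: Vec<Change>) -> u64 {
    let mut guard = regs.write().expect("registry lock poisoned");
    for change in batch {
        match change {
            Change::UpsertSession { id, meta, path } => {
                guard.session_metas.insert(id, meta);
                // The old mapping is stale once the file has been rewritten.
                guard.cached_paths.remove(&path);
            }
            Change::RemoveSession { id, path } => {
                guard.session_metas.remove(&id);
                guard.cached_paths.remove(&path);
            }
        }
    }
    guard.generation += 1;
    guard.generation
}

/// The only writer: drains batches in arrival order until every sender is dropped.
pub fn run_apply_loop(regs: Arc<RwLock<Registries>>, rx: Receiver<Vec<Change>>) {
    for batch in rx {
        apply_batch(&regs, batch);
    }
}
```

Holding one lock for the whole batch is the simplest way to get batch atomicity, but it trades against the lock-contention concern above; a production version might build the new state off to the side and swap it in.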

### 2) Generation as flow contract, not just cache invalidation

Decision:

- Every plan/resolve/hydrate stage carries generation bounds.

Why:

- Lets staged APIs detect stale plans and either retry or return typed stale errors.
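
To illustrate "retry or return a typed stale error", here is a hedged sketch of a bounded re-plan loop. `StalePlan` and `run_with_replan` are hypothetical names, and the check is deliberately coarse: any generation advance during the attempt counts as stale. A real implementation would more likely consult per-entity dirtiness.

```rust
/// Hypothetical typed error returned when every attempt raced with a newer generation.
#[derive(Debug, PartialEq, Eq)]
pub struct StalePlan {
    pub planned_generation: u64,
    pub observed_generation: u64,
}

/// Run `attempt` against the generation read before it starts. If the
/// generation moved while the attempt ran, re-plan, up to `max_attempts`
/// times (at least one attempt is always made).
pub fn run_with_replan<T>(
    max_attempts: usize,
    mut current_generation: impl FnMut() -> u64,
    mut attempt: impl FnMut(u64) -> T,
) -> Result<T, StalePlan> {
    let mut attempts_left = max_attempts.max(1);
    loop {
        let planned = current_generation();
        let out = attempt(planned);
        let observed = current_generation();
        if observed == planned {
            return Ok(out);
        }
        attempts_left -= 1;
        if attempts_left == 0 {
            return Err(StalePlan {
                planned_generation: planned,
                observed_generation: observed,
            });
        }
    }
}
```

Callers that prefer to surface staleness immediately can pass `max_attempts = 1` and handle `StalePlan` themselves.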

### 3) Entity-first classification

Decision:

- Convert paths into `EntityKey` early and keep both path + key in events.

Why:

- Trees and CDC operate on identity keys; cache invalidation still needs paths.
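
As a sketch of what early classification could look like, the function below maps a path relative to the `storage/` root onto the four layouts listed under "Filesystem repositories". The shapes of `EntityKey`, `ChangeOp`, and `ClassifiedChange` are assumptions made for this example, not the crate's actual definitions.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical identity key; the real `EntityKey` may be shaped differently.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum EntityKey {
    Session { project: String, session: String },
    Message { session: String, message: String },
    Part { message: String, part: String },
    SessionDiff { session: String },
}

/// Hypothetical operation tag attached by the classifier.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChangeOp {
    Upsert,
    Remove,
}

/// Keeps both the path (for cache invalidation) and the key (for trees and CDC).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ClassifiedChange {
    pub path: PathBuf,
    pub key: EntityKey,
    pub op: ChangeOp,
}

/// Classify a path relative to the storage root, for example
/// `message/<session>/<message>.json`. Returns `None` for unrelated files.
pub fn classify(rel_path: &Path, exists: bool) -> Option<ClassifiedChange> {
    if rel_path.extension()?.to_str()? != "json" {
        return None;
    }
    // Drop the `.json` suffix so the last component is the bare ID.
    let stem_path = rel_path.with_extension("");
    let parts: Vec<&str> = stem_path
        .iter()
        .map(|component| component.to_str())
        .collect::<Option<_>>()?;
    let key = match parts.as_slice() {
        ["session", project, session] => EntityKey::Session {
            project: project.to_string(),
            session: session.to_string(),
        },
        ["message", session, message] => EntityKey::Message {
            session: session.to_string(),
            message: message.to_string(),
        },
        ["part", message, part] => EntityKey::Part {
            message: message.to_string(),
            part: part.to_string(),
        },
        ["session_diff", session] => EntityKey::SessionDiff {
            session: session.to_string(),
        },
        _ => return None,
    };
    let op = if exists { ChangeOp::Upsert } else { ChangeOp::Remove };
    Some(ClassifiedChange { path: rel_path.to_path_buf(), key, op })
}
```

Because the event carries both `path` and `key`, the structural registries and the mmap cache can be updated from the same record without re-parsing the path.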

### 4) Explicit resync protocol

Decision:

- On continuity loss or queue overflow, emit `ResyncStarted`/`ResyncCompleted` and rebuild structural registries in one controlled pass.

Why:

- Prevents silent divergence between local registries and filesystem truth.
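
One way the overflow half of this protocol could be wired, sketched with a bounded `std::sync::mpsc::sync_channel`. `ResyncSignal`, `offer`, and `resync_if_needed` are hypothetical; in particular, whether `ResyncStarted` and `ResyncCompleted` end up as `CdcEvent` variants or as a separate type is left open here. A continuity break (`is_fresh_instance`) could set the same flag.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{SyncSender, TrySendError};

/// Hypothetical signals bracketing a full rebuild.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ResyncSignal {
    ResyncStarted,
    ResyncCompleted { generation: u64 },
}

/// Producer side: never block the watch backend. If the bounded queue is
/// full, drop the item and record that a resync is owed.
/// Returns `true` if the item was queued.
pub fn offer<T>(tx: &SyncSender<T>, item: T, resync_needed: &AtomicBool) -> bool {
    match tx.try_send(item) {
        Ok(()) => true,
        Err(TrySendError::Full(_)) => {
            resync_needed.store(true, Ordering::SeqCst);
            false
        }
        Err(TrySendError::Disconnected(_)) => false,
    }
}

/// Writer side: if a resync is owed, run one rebuild bracketed by
/// start/complete signals and advance the generation.
/// Returns `true` if a rebuild ran.
pub fn resync_if_needed(
    resync_needed: &AtomicBool,
    generation: &mut u64,
    rebuild: impl FnOnce(),
    signals: &mut Vec<ResyncSignal>,
) -> bool {
    // `swap` clears the flag, so triggers that piled up collapse into one rebuild.
    if !resync_needed.swap(false, Ordering::SeqCst) {
        return false;
    }
    signals.push(ResyncSignal::ResyncStarted);
    rebuild();
    *generation += 1;
    signals.push(ResyncSignal::ResyncCompleted { generation: *generation });
    true
}
```

Consumers that see `ResyncStarted` can stop trusting their cursor until the matching `ResyncCompleted` arrives with the new generation.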

### 5) Snapshot-friendly reference leaves

Decision:

- Reference leaves remain valid for in-flight reads even when newer generations exist.

Why:

- Preserves safe snapshot semantics while allowing next reads to observe fresh data.
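
The snapshot behavior falls out of `Arc` reference counting: evicting a cache entry drops only the cache's reference, so in-flight holders keep reading the old bytes. The sketch below uses a plain `Vec<u8>` in place of a real memory map, and `Snapshot` and `SnapshotCache` are illustrative stand-ins for `MappedFile` and `MappedFileCache`. Stamping each snapshot with its load generation is an assumption that speaks to the "latest or stale" concern above.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};

/// Stand-in for `MappedFile`: owned bytes plus the generation they were loaded at.
pub struct Snapshot {
    pub bytes: Vec<u8>,
    pub generation: u64,
}

/// Stand-in for `MappedFileCache`: `path -> Arc<Snapshot>`.
#[derive(Default)]
pub struct SnapshotCache {
    entries: Mutex<HashMap<PathBuf, Arc<Snapshot>>>,
}

impl SnapshotCache {
    /// Return the cached snapshot, or load one and stamp it with `generation`.
    pub fn get_or_load(
        &self,
        path: &Path,
        generation: u64,
        load: impl FnOnce() -> Vec<u8>,
    ) -> Arc<Snapshot> {
        let mut entries = self.entries.lock().expect("cache lock poisoned");
        if let Some(hit) = entries.get(path) {
            return Arc::clone(hit);
        }
        let snapshot = Arc::new(Snapshot { bytes: load(), generation });
        entries.insert(path.to_path_buf(), Arc::clone(&snapshot));
        snapshot
    }

    /// Drop the cache's reference; existing `Arc` holders are unaffected.
    pub fn evict(&self, path: &Path) {
        self.entries.lock().expect("cache lock poisoned").remove(path);
    }
}
```

A reader holding the old `Arc<Snapshot>` can compare its `generation` with the registry's current value to decide whether to re-request the leaf.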

## Tree-Specific Coherence Rules

1. `SessionFlowScope` includes `planned_generation`.
2. Resolve verifies `current_generation >= planned_generation` and checks per-entity dirtiness.
3. Hydrate verifies leaf freshness by entity/path generation before parse.
4. If stale, return a typed stale result or retry internally according to the re-plan policy.
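
The rules above can be sketched as a lookup against the generation registry. Everything here is an assumption for illustration: `EntityKey` is reduced to a `String` alias, `GenerationRegistry` is a hypothetical struct holding the `generation`, `entity_dirty`, and `path_dirty` fields named earlier, and an input counts as stale when it was dirtied at a generation later than the plan's.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Stand-in for the real identity key type.
pub type EntityKey = String;

#[derive(Debug, PartialEq, Eq)]
pub enum FlowStaleness {
    Fresh,
    StaleEntity { key: EntityKey, dirty_generation: u64 },
    StalePath { path: PathBuf, dirty_generation: u64 },
}

/// Hypothetical generation registry: the current generation plus the last
/// generation at which each entity or path was dirtied.
#[derive(Default)]
pub struct GenerationRegistry {
    pub generation: u64,
    pub entity_dirty: HashMap<EntityKey, u64>,
    pub path_dirty: HashMap<PathBuf, u64>,
}

impl GenerationRegistry {
    /// Report the first input dirtied after `planned_generation`, checking
    /// entities (resolve) before paths (hydrate).
    pub fn check(
        &self,
        planned_generation: u64,
        keys: &[EntityKey],
        paths: &[&Path],
    ) -> FlowStaleness {
        for key in keys {
            if let Some(&dirty) = self.entity_dirty.get(key) {
                if dirty > planned_generation {
                    return FlowStaleness::StaleEntity {
                        key: key.clone(),
                        dirty_generation: dirty,
                    };
                }
            }
        }
        for path in paths {
            if let Some(&dirty) = self.path_dirty.get(*path) {
                if dirty > planned_generation {
                    return FlowStaleness::StalePath {
                        path: path.to_path_buf(),
                        dirty_generation: dirty,
                    };
                }
            }
        }
        FlowStaleness::Fresh
    }
}
```

Checking only the entities and paths a flow actually touches keeps unrelated writes from invalidating the plan.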

## Draft Types to Add

```rust
use std::path::PathBuf;

// `EntityKey` is the identity key produced by the classification step.

/// Generation stamps carried through a flow.
pub struct FlowGenerationGuard {
    pub planned_generation: u64,
    pub resolved_generation: u64,
}

/// Result of a stage's staleness check.
pub enum FlowStaleness {
    Fresh,
    StaleEntity { key: EntityKey, dirty_generation: u64 },
    StalePath { path: PathBuf, dirty_generation: u64 },
}

/// Summary of one batch applied by the writer loop.
pub struct ApplyBatchReport {
    pub generation: u64,
    pub applied_events: usize,
    pub structural_updates: usize,
    pub cache_updates: usize,
}
```

## Risks and Mitigations

Risk: Registry fan-out complexity grows quickly.

- Mitigation: strict apply pipeline and shared `ApplyContext` used by all mutation handlers.

Risk: Over-invalidation reduces cache efficiency.

- Mitigation: track both entity and path scopes; invalidate minimally.

Risk: Recovery storms on repeated backend disruptions.

- Mitigation: coalesce resync triggers and debounce full rebuilds.
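
A small sketch of the coalesce-and-debounce mitigation. `ResyncDebouncer` is a hypothetical helper: repeated triggers collapse into one pending flag, and a rebuild is allowed only when at least `min_interval` has passed since the previous one. The clock is passed in as an argument to keep the logic easy to test.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper that coalesces resync triggers and rate-limits rebuilds.
pub struct ResyncDebouncer {
    min_interval: Duration,
    last_rebuild: Option<Instant>,
    pending: bool,
}

impl ResyncDebouncer {
    pub fn new(min_interval: Duration) -> Self {
        Self { min_interval, last_rebuild: None, pending: false }
    }

    /// Record a trigger; any number of triggers collapse into one pending rebuild.
    pub fn trigger(&mut self) {
        self.pending = true;
    }

    /// Returns `true` if a rebuild should run at `now`, and records it as run.
    /// A pending trigger inside the quiet period stays pending for a later call.
    pub fn should_rebuild(&mut self, now: Instant) -> bool {
        if !self.pending {
            return false;
        }
        let due = match self.last_rebuild {
            None => true,
            Some(last) => now.duration_since(last) >= self.min_interval,
        };
        if due {
            self.pending = false;
            self.last_rebuild = Some(now);
        }
        due
    }
}
```

The writer loop would call `should_rebuild` on each tick, so a burst of backend disruptions produces at most one full rebuild per interval.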

Risk: Feature incompatibility in watch backend dependencies.

- Mitigation: keep `ChangeFeed` abstraction and test manual/notify feeds as parity backstops.
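
For illustration, a `ChangeFeed` abstraction with a manual backend might look like the sketch below. The trait method, the `FileChange` and `FeedBatch` fields, and `ManualFeed` are all assumptions; only the `ChangeFeed` and `FileChange` names and the `is_fresh_instance` flag come from this document.

```rust
use std::collections::VecDeque;
use std::path::PathBuf;

/// Hypothetical canonical change record.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FileChange {
    pub path: PathBuf,
    pub exists: bool,
}

/// One delivery from a backend; `is_fresh_instance` signals lost continuity.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FeedBatch {
    pub changes: Vec<FileChange>,
    pub is_fresh_instance: bool,
}

/// Backend-agnostic feed: Watchman, notify, and manual feeds would each implement this.
pub trait ChangeFeed {
    /// Returns the next batch, or `None` when the feed has nothing more to deliver.
    fn next_batch(&mut self) -> Option<FeedBatch>;
}

/// Manual backend for tests: batches are pushed by hand and replayed in order.
#[derive(Default)]
pub struct ManualFeed {
    queued: VecDeque<FeedBatch>,
}

impl ManualFeed {
    pub fn push(&mut self, batch: FeedBatch) {
        self.queued.push_back(batch);
    }
}

impl ChangeFeed for ManualFeed {
    fn next_batch(&mut self) -> Option<FeedBatch> {
        self.queued.pop_front()
    }
}

/// Drain a feed, returning all changes in order and whether any batch
/// reported a continuity break.
pub fn drain(feed: &mut dyn ChangeFeed) -> (Vec<FileChange>, bool) {
    let mut changes = Vec::new();
    let mut continuity_lost = false;
    while let Some(batch) = feed.next_batch() {
        continuity_lost |= batch.is_fresh_instance;
        changes.extend(batch.changes);
    }
    (changes, continuity_lost)
}
```

Running the same scenario through `ManualFeed` and a real backend is one way to get the parity check the mitigation describes.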

## Acceptance Criteria

1. All registry updates for one change batch are applied atomically in one writer loop.
2. Tree stage outputs carry generation context and can report staleness.
3. Path and entity invalidation are both represented and test-covered.
4. Resync protocol is explicit and observable.
5. CDC streams are emitted from the same canonical apply path used by index/cache updates.

## What This Does Not Require

- It does not require implementing the exact original watchman plan.
- It does not require immediate durable event log persistence.
- It does require registry coherence and tree-stage correctness regardless of backend.