+{"id":"ocs-bon-builder","title":"Introduce bon::Builder across key types","description":"Replace manual constructor patterns with `bon::Builder` derive to reduce API surface, improve ergonomics, and make future configuration options additive without proliferating `with_*` variants.\n\nKey goals:\n- Unify mutually exclusive construction modes into fluent builder APIs\n- Reduce invalid state combinations for structs with many related fields\n- Make test fixtures and synthetic data creation cleaner\n- Establish a consistent pattern for configuration-heavy types","status":"open","priority":2,"issue_type":"epic","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-23T18:52:23.254842635-05:00","created_by":"rektide de la faye","updated_at":"2026-02-23T18:52:23.254842635-05:00"}
+{"id":"ocs-bon-domain-structs","title":"Add bon::Builder to large serde domain structs","description":"Add builders to message/part domain types with many optional fields to simplify test fixtures and future synthetic data generation.\n\nCandidates:\n- `AssistantMessage` (`src/types/message.rs:63-88`) - 15 fields, many optional\n- `UserMessage` (`src/types/message.rs:30-45`) - 9 fields, many optional\n- `ToolStateCompleted` (`src/types/part.rs:128-137`) - 6 fields\n- `SubtaskPart` (`src/types/part.rs:254-264`) - 7 fields\n\nThese are serde types, so builder is primarily for programmatic construction (tests, fixtures).\n\nAcceptance:\n- Selected structs derive `bon::Builder`\n- At least one test demonstrates cleaner fixture construction\n- Serde behavior unchanged","status":"open","priority":2,"issue_type":"task","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-23T18:53:17.020990858-05:00","created_by":"rektide de la faye","updated_at":"2026-02-23T18:53:17.020990858-05:00","dependencies":[{"issue_id":"ocs-bon-domain-structs","depends_on_id":"ocs-bon-builder","type":"parent-child","created_at":"2026-02-23T18:56:40.272222113-05:00","created_by":"rektide de la faye"}]}
+{"id":"ocs-bon-index-meta","title":"Add bon::Builder to index metadata structs","description":"Add builders to `SessionMeta`, `MessageMeta`, and `PartRef` to improve hot-path construction clarity.\n\nLocations:\n- `SessionMeta` literal at `src/index.rs:295-305`\n- `MessageMeta` literal at `src/index.rs:335-343`\n- `PartRef::new()` constructor at `src/index.rs:51-64`\n\nTarget:\n```rust\nSessionMeta::builder()\n .id(session_id.clone())\n .title(session.title)\n .created(session.time.created)\n .updated(session.time.updated)\n .project_id(project_id.to_string())\n .message_count(message_count)\n .build()\n```\n\nAcceptance:\n- All three structs derive `bon::Builder`\n- Inline literals replaced with builder calls\n- Tests pass","status":"open","priority":2,"issue_type":"task","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-23T18:53:13.369488571-05:00","created_by":"rektide de la faye","updated_at":"2026-02-23T18:53:13.369488571-05:00","dependencies":[{"issue_id":"ocs-bon-index-meta","depends_on_id":"ocs-bon-builder","type":"parent-child","created_at":"2026-02-23T18:56:39.554905836-05:00","created_by":"rektide de la faye"}]}
+{"id":"ocs-bon-materializer","title":"Add bon::Builder to SessionMaterializer","description":"Replace `new()`, `detect()`, `with_paths()`, `with_reader()` constructors with a unified builder pattern.\n\nCurrent API (`src/materializer.rs:16-33`):\n- `SessionMaterializer::new()` - auto-detect paths\n- `SessionMaterializer::detect()` - same as new\n- `SessionMaterializer::with_paths(StoragePaths)` - explicit paths\n- `SessionMaterializer::with_reader(FileReader)` - custom reader\n\nTarget API:\n```rust\nlet m = SessionMaterializer::builder()\n .paths(storage_paths) // or .detect_paths() or .reader(file_reader)\n .build()?;\n```\n\nAcceptance:\n- `SessionMaterializer` derives `bon::Builder`\n- Existing constructors deprecated or removed\n- `SessionLoader` similarly updated for consistency\n- Tests pass","status":"open","priority":1,"issue_type":"task","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-23T18:53:06.117888779-05:00","created_by":"rektide de la faye","updated_at":"2026-02-23T18:53:06.117888779-05:00","dependencies":[{"issue_id":"ocs-bon-materializer","depends_on_id":"ocs-bon-builder","type":"parent-child","created_at":"2026-02-23T18:56:38.449090839-05:00","created_by":"rektide de la faye"}]}
+{"id":"ocs-bon-storage-paths","title":"Add bon::Builder to StoragePaths","description":"Replace manual struct assembly with builder pattern for safer path configuration.\n\nCurrent state (`src/storage/paths.rs:5-13`):\n- 7 related fields that must stay consistent\n- `detect()` and `from_base()` constructors\n- Test in `src/storage/paths.rs:141` uses literal struct assembly\n\nTarget:\n```rust\nlet paths = StoragePaths::builder()\n .base(path_buf)\n .build()?;\n\n// Or for tests:\nlet paths = StoragePaths::builder()\n .root(\"/test/storage\")\n .session(\"/test/storage/session\")\n .message(\"/test/storage/message\")\n .part(\"/test/storage/part\")\n .diff(\"/test/storage/session_diff\")\n .snapshot(\"/test/storage/snapshot\")\n .migration(\"/test/storage/migration\")\n .build();\n```\n\nAcceptance:\n- `StoragePaths` derives `bon::Builder`\n- `detect()` and `from_base()` remain as convenience methods\n- Test updated to use builder","status":"open","priority":1,"issue_type":"task","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-23T18:53:09.833679196-05:00","created_by":"rektide de la faye","updated_at":"2026-02-23T18:53:09.833679196-05:00","dependencies":[{"issue_id":"ocs-bon-storage-paths","depends_on_id":"ocs-bon-builder","type":"parent-child","created_at":"2026-02-23T18:56:38.932992393-05:00","created_by":"rektide de la faye"}]}
 {"id":"ocs-core-err","title":"Error Type and Result Alias","description":"Define unified Error enum with all variants and Result\u003cT\u003e alias","status":"closed","priority":1,"issue_type":"task","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-18T18:07:43.184485341-05:00","created_by":"rektide de la faye","updated_at":"2026-02-18T18:08:59.153252492-05:00","closed_at":"2026-02-18T18:08:59.153252492-05:00","close_reason":"Completed"}
 {"id":"ocs-core-id","title":"ID Types with Timestamp Extraction","description":"Define typed identifiers for sessions, messages, and parts with timestamp extraction, parse/display/debug traits, and serde support","status":"closed","priority":1,"issue_type":"task","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-18T18:07:40.590649889-05:00","created_by":"rektide de la faye","updated_at":"2026-02-18T18:08:58.530206256-05:00","closed_at":"2026-02-18T18:08:58.530206256-05:00","close_reason":"Completed"}
 {"id":"ocs-core-type","title":"Core Data Types","description":"Define Rust structs matching opencode schemas: SessionInfo, Message (User/Assistant), Part (12 variants), with serde support","status":"closed","priority":1,"issue_type":"task","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-18T18:07:42.004452213-05:00","created_by":"rektide de la faye","updated_at":"2026-02-18T18:08:58.827551611-05:00","closed_at":"2026-02-18T18:08:58.827551611-05:00","close_reason":"Completed"}
···
 {"id":"ocs-load-sess","title":"Session Loader","description":"Load a complete session with all metadata and associated diff file","status":"closed","priority":2,"issue_type":"task","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-18T18:09:10.541800693-05:00","created_by":"rektide de la faye","updated_at":"2026-02-18T18:11:53.49490446-05:00","closed_at":"2026-02-18T18:11:53.49490446-05:00","close_reason":"Completed"}
 {"id":"ocs-mat-query","title":"Query Interface","description":"High-level query API: session tree, time filtering, project filtering, relationship navigation","status":"closed","priority":2,"issue_type":"task","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-18T18:12:06.668313077-05:00","created_by":"rektide de la faye","updated_at":"2026-02-18T18:18:57.808443063-05:00","closed_at":"2026-02-18T18:18:57.808443063-05:00","close_reason":"Completed"}
 {"id":"ocs-mat-sess","title":"Session Materializer","description":"SessionMaterializer with index-based lookups and lazy content loading via mmap","status":"closed","priority":2,"issue_type":"task","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-18T18:12:04.965159412-05:00","created_by":"rektide de la faye","updated_at":"2026-02-18T18:18:57.59091873-05:00","closed_at":"2026-02-18T18:18:57.59091873-05:00","close_reason":"Completed"}
+{"id":"ocs-optimize-append-indexing","title":"Optimize update-read path to index only newly appended content","description":"Investigate what content is typically appended in update operations and optimize the indexing to only process newly appended content rather than re-indexing entire files/structures.","status":"open","priority":2,"issue_type":"task","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-20T03:14:29.630420381-05:00","created_by":"rektide de la faye","updated_at":"2026-02-20T03:14:29.630420381-05:00"}
 {"id":"ocs-stor-mmap","title":"Memory-Mapped File Wrapper","description":"Create safe wrapper around memmap2 with caching for shared ownership","status":"closed","priority":1,"issue_type":"task","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-18T18:07:45.445449396-05:00","created_by":"rektide de la faye","updated_at":"2026-02-18T18:09:00.022211385-05:00","closed_at":"2026-02-18T18:09:00.022211385-05:00","close_reason":"Completed"}
 {"id":"ocs-stor-path","title":"Storage Path Resolution","description":"Implement XDG-compliant path discovery with path builders for each entity type","status":"closed","priority":1,"issue_type":"task","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-18T18:07:44.237866599-05:00","created_by":"rektide de la faye","updated_at":"2026-02-18T18:08:59.580357656-05:00","closed_at":"2026-02-18T18:08:59.580357656-05:00","close_reason":"Completed"}
 {"id":"ocs-stor-read","title":"File Reader with JSON Parsing","description":"Read and parse JSON files for each entity type with mmap caching","status":"closed","priority":1,"issue_type":"task","owner":"rektide+git@voodoowarez.com","created_at":"2026-02-18T18:07:46.577111181-05:00","created_by":"rektide de la faye","updated_at":"2026-02-18T18:09:00.301894724-05:00","closed_at":"2026-02-18T18:09:00.301894724-05:00","close_reason":"Completed"}
# Regenesis Tree: Structured Session Graph APIs

> This document defines the next tree API shape for `opencode-session-rs`, including problem framing, current state, and draft plans with explicit design decisions.

## Problem

Consumers need two things at once:

1. A predictable graph model for navigating sessions -> messages -> parts.
2. Control over execution stages, so they can stop early (metadata only) or continue to hydrated payloads.

Historically, tree loading tended to jump from IDs directly to fully hydrated structures, which couples traversal and parsing too tightly.

## Current State

Current modules and capabilities:

- Index graph and builder in [`/src/index.rs`](/src/index.rs)
- Staged materializer methods in [`/src/materializer.rs`](/src/materializer.rs)
- Loader wrappers in [`/src/loader.rs`](/src/loader.rs)
- Reader/listing primitives in [`/src/storage/reader.rs`](/src/storage/reader.rs)
- Public exports in [`/src/lib.rs`](/src/lib.rs)

Recent progress:

- Flow decomposition methods exist (`plan_*`, `run_*`) and are configurable with Bon builder options.
- Tree assembly is more modular than before.

Remaining gaps:

1. Tree API contracts are still mixed between reference and hydrated output semantics.
2. Projection choices are not explicit enough for consumer intent.
3. Reporting/diagnostics for partial graph construction is under-specified.

## Target Tree Model

Use a three-shape model with explicit boundaries:

1. **Structure Tree**: IDs + relationships only.
2. **Reference Tree**: structure + mapped spans for payload leaves.
3. **Hydrated Tree**: reference tree plus parsed objects.

```mermaid
flowchart LR
    Index[SessionIndex] --> StructureTree[StructureTree]
    StructureTree --> ReferenceTree[ReferenceTree]
    ReferenceTree --> HydratedTree[HydratedTree]
```

### Why three shapes?

- Structure tree is cheapest for navigation/search/filter.
- Reference tree is the zero-copy contract boundary.
- Hydrated tree is convenience for clients that want full structs.

## Draft Types

### Structure level

```rust
pub struct SessionNode {
    pub session_id: SessionId,
    pub project_id: String,
    pub message_ids: Vec<MessageId>,
}

pub struct MessageNode {
    pub message_id: MessageId,
    pub session_id: SessionId,
    pub part_ids: Vec<PartId>,
}
```

### Reference level

```rust
pub struct SessionRefNode {
    pub key: SessionKey,
    pub span: MappedSpan,
    pub messages: Vec<MessageRefNode>,
}

pub struct MessageRefNode {
    pub key: MessageKey,
    pub span: MappedSpan,
    pub parts: Vec<PartRefNode>,
}

pub struct PartRefNode {
    pub key: PartKey,
    pub kind: PartKind,
    pub span: MappedSpan,
}
```

### Hydrated level

```rust
pub struct SessionHydratedNode {
    pub info: SessionInfo,
    pub messages: Vec<MessageHydratedNode>,
}

pub struct MessageHydratedNode {
    pub message: Message,
    pub parts: Vec<Part>,
}
```

## Draft Flow Pipeline

### Stage 1: Plan

Build deterministic scopes from options and index.

- input: flow options
- output: planned session/message/part IDs

### Stage 2: Resolve references

Convert planned IDs into mapped spans.

- input: scopes
- output: reference tree

### Stage 3: Optional hydrate

Parse selected references into typed structs.

- input: reference tree
- output: hydrated tree or partial hydrated projection
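
The three stages can be sketched end to end with stand-in types. This is a minimal illustration only, not the crate's API: `Store`, `Plan`, `RefTree`, the `u64` IDs, and the use of UTF-8 decoding in place of JSON parsing are all assumptions chosen so the example needs nothing beyond the standard library.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for on-disk storage: message ID -> raw payload bytes.
pub type Store = BTreeMap<u64, Vec<u8>>;

/// Stage 1 output: the IDs selected for loading, nothing else.
#[derive(Debug, PartialEq)]
pub struct Plan {
    pub message_ids: Vec<u64>,
}

/// Stage 2 output: IDs paired with borrowed, still-unparsed payload bytes.
#[derive(Debug, PartialEq)]
pub struct RefTree<'a> {
    pub leaves: Vec<(u64, &'a [u8])>,
}

/// Stage 1: select IDs in ascending order, optionally truncated by a limit.
pub fn plan(store: &Store, limit: Option<usize>) -> Plan {
    let ids = store.keys().copied();
    let message_ids = match limit {
        Some(n) => ids.take(n).collect(),
        None => ids.collect(),
    };
    Plan { message_ids }
}

/// Stage 2: map planned IDs to byte slices without parsing them.
/// IDs absent from the store are dropped here; a fuller version would
/// record them in a report instead.
pub fn resolve<'a>(store: &'a Store, plan: &Plan) -> RefTree<'a> {
    let leaves = plan
        .message_ids
        .iter()
        .filter_map(|id| store.get(id).map(|bytes| (*id, bytes.as_slice())))
        .collect();
    RefTree { leaves }
}

/// Stage 3: parse only on request. "Parsing" here is UTF-8 decoding,
/// standing in for JSON deserialization.
pub fn hydrate(tree: &RefTree<'_>) -> Vec<(u64, Result<String, std::str::Utf8Error>)> {
    tree.leaves
        .iter()
        .map(|(id, bytes)| (*id, std::str::from_utf8(bytes).map(str::to_owned)))
        .collect()
}
```

A caller that only needs IDs stops after `plan`; one that needs byte addresses stops after `resolve`; only `hydrate` allocates owned payloads.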

## Key Design Choices

### 1) IDs remain the graph spine

Decision:

- All internal maps and joins remain keyed by typed IDs.

Why:

- Matches opencode storage relationships.
- Keeps tree planning independent from payload parse cost.

### 2) Deterministic ordering is mandatory

Decision:

- Externally visible tree vectors are ordered consistently:
  - sessions: descending session ID
  - messages: ascending message ID
  - parts: ascending part ID

Why:

- Stable behavior for caches, pagination, and tests.
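
The ordering rule can be sketched as two small helpers. The `Id` newtype below is a hypothetical stand-in for the crate's typed IDs, and the sketch assumes that comparing IDs by their string form gives the intended order.

```rust
/// Hypothetical stand-in for a typed ID that orders by its string form.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Id(pub String);

/// Sessions: descending ID.
pub fn order_sessions(ids: &mut [Id]) {
    ids.sort_unstable_by(|a, b| b.cmp(a));
}

/// Messages and parts: ascending ID.
pub fn order_children(ids: &mut [Id]) {
    ids.sort_unstable();
}
```

Sorting at the API boundary, rather than relying on directory listing order, is what makes repeated scans return identical vectors.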

### 3) Stage outputs are first-class types

Decision:

- Do not collapse plan/resolve/hydrate into one opaque return type.

Why:

- Consumers can opt into only the stage they need.
- Easier observability and diagnostics per stage.

### 4) Flow options use Bon for ergonomics

Decision:

- Continue using Bon builder for flow configuration.

Why:

- Keeps options explicit and discoverable.
- Avoids telescoping constructors as options grow.

### 5) Tree assembly should not hide errors

Decision:

- Return typed reports for skipped nodes and failed hydrations.

Why:

- Consumers need policy control (strict/fail-fast vs tolerant/partial).

## Consumer-Facing API Draft

Potential high-level methods on [`/src/materializer.rs`](/src/materializer.rs):

- `plan_session_tree(options) -> SessionPlan`
- `resolve_session_tree(plan) -> SessionRefTree`
- `hydrate_session_tree(ref_tree) -> SessionHydratedTree`
- `hydrate_message_nodes(ref_tree, ids) -> PartialHydratedMessages`

This naming makes stage boundaries explicit and testable.

## Diagnostics and Reports

Add reports for each stage:

- `PlanReport` (counts, truncated by limits)
- `ResolveReport` (missing files, invalid references)
- `HydrateReport` (parse failures by key/path)

Each report should include:

- stage name
- total attempted
- total succeeded
- skipped/failure entries with typed reason
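
One possible shape for such a report, carrying the four listed fields plus a strict/tolerant check, is sketched below. `StageReport`, `FailureReason`, `Policy`, and `check` are hypothetical names; the real reports may well be separate types per stage.

```rust
/// Hypothetical typed reasons; the real set would follow the crate's error type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FailureReason {
    MissingFile,
    InvalidReference,
    ParseError(String),
}

/// Consumer policy: fail on the first recorded problem, or accept partial output.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Policy {
    Strict,
    Tolerant,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StageReport {
    pub stage: &'static str,
    pub attempted: usize,
    pub succeeded: usize,
    /// (entity key, reason) for every skipped or failed entry.
    pub failures: Vec<(String, FailureReason)>,
}

impl StageReport {
    pub fn new(stage: &'static str) -> Self {
        Self { stage, attempted: 0, succeeded: 0, failures: Vec::new() }
    }

    pub fn record_ok(&mut self) {
        self.attempted += 1;
        self.succeeded += 1;
    }

    pub fn record_failure(&mut self, key: impl Into<String>, reason: FailureReason) {
        self.attempted += 1;
        self.failures.push((key.into(), reason));
    }

    /// Strict rejects any failure; tolerant always accepts.
    pub fn check(&self, policy: Policy) -> Result<(), &(String, FailureReason)> {
        match (policy, self.failures.first()) {
            (Policy::Strict, Some(first)) => Err(first),
            _ => Ok(()),
        }
    }
}
```

Keeping the policy decision in `check` rather than in the stage itself lets the same stage run serve both strict and tolerant consumers.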

## Acceptance Criteria

1. Tree APIs expose plan/resolve/hydrate as separate public methods.
2. Reference tree leaves are mmap spans (no implicit parse in resolve stage).
3. Hydrated helpers are wrappers over resolve + hydrate stages.
4. Deterministic ordering is enforced and tested.
5. Stage reports are available for strict and tolerant consumer policies.

## Migration Strategy

1. Introduce new stage types/methods in parallel with existing convenience methods.
2. Re-implement convenience methods on top of staged APIs.
3. Mark old ambiguous methods for cleanup in next breaking pass.
4. Update [`/README.md`](/README.md) examples to prefer staged flow.

## Out of Scope for This Tree Pass

- Durable CDC replay log.
- Full watch backend implementation.
- Rustdoc-heavy narrative documentation (to follow after contracts settle).
+219
doc/discovery/regenesis-zerocopy.md
# Regenesis Zero-Copy: Mmap-First Tree Leaves

> This document formalizes a strict zero-copy direction for `opencode-session-rs`: every leaf returned by tree/projection APIs must be an addressable region in a memory-mapped file.

## Problem

The project goal has always been memory-aware session materialization: let the kernel manage hot/cold file pages and reclaim them under pressure. The current implementation partially achieves this but still allocates eagerly in key paths.

Required invariant going forward:

- Any tree leaf representing a session/message/part payload is a file mapping reference (`Arc<MappedFile>`) plus byte range.
- Parsing into owned structs is an explicit opt-in step, not the default tree assembly path.

## Current State

Current implementation status:

- Uses `mmap` via [`/src/storage/mmap.rs`](/src/storage/mmap.rs).
- Uses typed parse via `serde_json::from_slice` in [`/src/storage/reader.rs`](/src/storage/reader.rs).
- Builds index metadata in [`/src/index.rs`](/src/index.rs).
- Exposes staged flow controls in [`/src/materializer.rs`](/src/materializer.rs) with Bon-based options.

What is good now:

1. File bytes are mmap-backed, so the OS can evict clean pages.
2. Mmap cache is shared by `Arc`, avoiding duplicate mappings.
3. Flow decomposition allows more selective loading than before.

What is still not aligned with strict zero-copy:

1. Typed reads eagerly deserialize JSON into owned heap structs.
2. Index building still parses entities that could remain references.
3. Public flow output currently returns hydrated objects (`LoadedSession`, `MessageWithParts`) by default.

## Target Model

### Principle

Build and return reference trees first. Hydration is layered on top.

```mermaid
flowchart TD
    Paths[StoragePaths] --> Index[Index relationships by IDs]
    Index --> RefTree[Reference tree]
    RefTree --> SpanLeaf[MappedSpan leaves]
    SpanLeaf --> ParseOnDemand[Optional parse adapters]
```

### Core leaf type

```rust
pub struct MappedSpan {
    pub file: Arc<MappedFile>,
    pub offset: usize,
    pub len: usize,
}

impl MappedSpan {
    pub fn as_bytes(&self) -> &[u8] {
        &self.file.as_bytes()[self.offset..self.offset + self.len]
    }
}
```

For current opencode layout (one JSON entity per file), most spans start as full-file spans:

- `offset = 0`
- `len = file.len()`

This still satisfies the contract that leaves are explicit file addresses.
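
A full-file span constructor and a bounds-checked accessor might look like the sketch below. The `MappedFile` here is a stand-in backed by a `Vec<u8>` so the example runs without a real mapping, and `full` and `try_as_bytes` are hypothetical helpers that do not appear in the draft type.

```rust
use std::sync::Arc;

/// Stand-in for the crate's `MappedFile`; a `Vec<u8>` replaces the real mmap.
pub struct MappedFile {
    bytes: Vec<u8>,
}

impl MappedFile {
    pub fn from_bytes(bytes: Vec<u8>) -> Self {
        Self { bytes }
    }

    pub fn as_bytes(&self) -> &[u8] {
        &self.bytes
    }

    pub fn len(&self) -> usize {
        self.bytes.len()
    }
}

pub struct MappedSpan {
    pub file: Arc<MappedFile>,
    pub offset: usize,
    pub len: usize,
}

impl MappedSpan {
    /// Span covering the whole file: `offset = 0`, `len = file.len()`.
    pub fn full(file: Arc<MappedFile>) -> Self {
        let len = file.len();
        Self { file, offset: 0, len }
    }

    /// Hypothetical checked accessor: returns `None` instead of panicking
    /// when the range overflows or falls outside the file.
    pub fn try_as_bytes(&self) -> Option<&[u8]> {
        let end = self.offset.checked_add(self.len)?;
        self.file.as_bytes().get(self.offset..end)
    }
}
```

A checked accessor matters once subspans exist, or if a file can shrink between indexing and access; with full-file spans built from the same mapping the range is always valid.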

## Draft API Plan

### A) Add reference-first reader APIs

Files:

- [`/src/storage/reader.rs`](/src/storage/reader.rs)

Add:

- `read_span(path) -> Result<MappedSpan>`
- `read_session_span(project_id, session_id)`
- `read_message_span(session_id, message_id)`
- `read_part_span(message_id, part_id)`

Keep parse helpers as adapters:

- `parse_span<T>(&MappedSpan) -> Result<T>`

### B) Add reference tree projection types

Files:

- [`/src/materializer.rs`](/src/materializer.rs)
- (new) `/src/materializer/projection.rs` (or same module initially)

Add:

- `SessionRefLeaf { key, span }`
- `MessageRefLeaf { key, span, part_refs }`
- `PartRefLeaf { key, kind, span }`
- `SessionRefTree { session, messages }`

### C) Make decomposed flow return ref trees by default

Files:

- [`/src/materializer.rs`](/src/materializer.rs)

Adjust:

- `run_session_flow` returns a ref-first result.
- Provide explicit hydration adapters:
  - `hydrate_session_info(&SessionRefLeaf)`
  - `hydrate_message(&MessageRefLeaf)`
  - `hydrate_part(&PartRefLeaf)`

### D) Preserve hydrated convenience API as wrappers

Files:

- [`/src/materializer.rs`](/src/materializer.rs)

Hydrated methods remain, but implemented as wrappers over ref flow + hydrate.

## Key Design Choices

### 1) Full-file spans now, subspans later

Decision:

- Start with full-file spans for all entities.

Why:

- Matches current on-disk format.
- Guarantees minimal complexity for first strict-zero-copy pass.
- Leaves room for future structural indexing/subspans if needed.

### 2) No implicit parse during tree assembly

Decision:

- Reference-tree assembly never deserializes JSON.

Why:

- Keeps memory profile predictable.
- Ensures the OS, not heap ownership, controls most payload memory pressure.

### 3) Metadata remains index-resident

Decision:

- Keep small identity/relationship metadata resident (`SessionId`, `MessageId`, counts, ordering).

Why:

- Structural queries must stay fast and stable.
- This metadata footprint is tiny compared to full payload hydration.

### 4) Cache lifecycle supports page reclamation

Decision:

- Keep mmap cache bounded/maintained (`prune_unused`, optional capacity policy).

Why:

- Arc retention determines mapping lifetime in-process.
- Kernel page eviction works best when dead mappings are also removable from cache.
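
The link between `Arc` retention and pruning can be sketched with a small cache. This is an assumption about how `prune_unused` could work, not a description of the existing `MappedFileCache`; `MappedFile` is again a `Vec<u8>` stand-in.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::Arc;

/// Stand-in for the crate's `MappedFile`; a `Vec<u8>` replaces the real mmap.
pub struct MappedFile(pub Vec<u8>);

#[derive(Default)]
pub struct MappedFileCache {
    entries: HashMap<PathBuf, Arc<MappedFile>>,
}

impl MappedFileCache {
    /// Store a mapping and hand back a shared handle to it.
    pub fn insert(&mut self, path: impl Into<PathBuf>, file: MappedFile) -> Arc<MappedFile> {
        let shared = Arc::new(file);
        self.entries.insert(path.into(), Arc::clone(&shared));
        shared
    }

    pub fn get(&self, path: &Path) -> Option<Arc<MappedFile>> {
        self.entries.get(path).cloned()
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// Drop every entry that only the cache still references.
    /// Returns how many entries were removed.
    pub fn prune_unused(&mut self) -> usize {
        let before = self.entries.len();
        self.entries.retain(|_, file| Arc::strong_count(file) > 1);
        before - self.entries.len()
    }
}
```

A strong count of 1 means the cache holds the last handle, so removing the entry drops the stand-in buffer here and would unmap the file in a real implementation. A shared, concurrent cache would need a lock around the count check and removal.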

## Operational Behavior

### Memory behavior expectations

With the target model:

1. Tree build/load mostly allocates IDs and small vectors/maps.
2. Payload bytes remain file-backed mmap pages.
3. Pages can be reclaimed by the OS and faulted back when re-accessed.
4. Hydration allocates heap only for explicitly requested entities.

### Failure behavior

Reference phase failures should surface by file/key, not generic parse errors.

Hydration phase failures should include entity key + path + parse context.

## Acceptance Criteria

1. A full session tree projection can be built without deserializing entity payloads.
2. All payload leaves in that tree expose `MappedSpan` addresses.
3. Existing hydrated APIs remain available as wrappers over the reference flow.
4. Integration tests prove memory-mapped references remain valid across staged operations.
5. No hidden parse path appears in reference tree code paths.

## Rollout Plan

1. Add `MappedSpan` and span readers in [`/src/storage/reader.rs`](/src/storage/reader.rs).
2. Add reference leaf/tree types in materializer module.
3. Switch flow planner/executor to ref-first results.
4. Re-implement hydrated calls as adapters.
5. Add fixtures and tests for zero-copy tree behavior.

## Risks and Mitigations

Risk: API confusion between ref and hydrated paths.

- Mitigation: clear naming (`*_ref_*`, `hydrate_*`) and explicit return types.

Risk: retained `Arc<MappedFile>` values can keep too many mappings alive.

- Mitigation: add periodic pruning and bounded cache policy knobs.

Risk: consumers assume borrowed lifetimes from parsed structs.

- Mitigation: avoid borrow-heavy parse API in v1; return owned parse outputs from explicit hydrate steps.
+266
doc/discovery/regenesis.md
# Regenesis: API Refinement Plan After Initial Implementation

> This document defines the next refactor pass for `opencode-session-rs` after the first major API reshape, before writing full API docs.

## Context

The project has moved from initial research and architecture design into a working implementation with a cleaner, more composable API surface.

Primary design references:

- [`/doc/discovery/opencode-session.md`](/doc/discovery/opencode-session.md)
- [`/doc/discovery/genesis.md`](/doc/discovery/genesis.md)
- [`/doc/discovery/breakdown.md`](/doc/discovery/breakdown.md)
- [`/doc/discovery/watchman.md`](/doc/discovery/watchman.md)

Recent implementation references:

- ergonomic decomposition commit: `db417d8954c86acc1fb4dca74a506d09c1a44efe`
- brainstorm/analysis commit: `370920960456d08dfac15cfd22bc9e53bc8e59cd`
- latest reshape commit: `b7cbef06`

## Current Shape (Post-Refactor)

The crate now has three clear layers:

1. `storage` layer for path resolution, mapped files, and typed reads
2. `index` layer for metadata graph and relationship navigation
3. `materializer` layer for high-level, ID-driven loading and orchestration

This is a strong direction, but we still need a final pass to formalize contracts and event surfaces before documenting the API as stable-for-now.

## Regenesis Goals

1. **Formal contracts**: make failure behavior and lookup guarantees explicit.
2. **Composable flows**: expose staged methods so consumers can run only the pieces they need.
3. **Structured change surface**: define CDC/event types that align with watch integration plans.
4. **Deterministic behavior**: ensure stable ordering and predictable filtering semantics.
5. **Tested guarantees**: add integration tests around malformed/missing data and partial corruption.

## Guiding Principles

- Keep domain-grouped modules; avoid flat API sprawl.
- Prefer explicit return types over hidden side effects.
- Let failures surface with enough structure for consumer policy decisions.
- Separate identity keys (session/message/part IDs) from ordering/version keys (generation + sequence).
- Preserve snapshot semantics for in-flight reads where possible.

## Architecture Target

```mermaid
flowchart TD
    Storage[storage module] --> IndexBuild[index builder]
    IndexBuild --> IndexView[index query view]
    IndexView --> MaterializerFlow[materializer staged flow]

    WatchSource[watch source] --> EventClassifier[event classifier]
    EventClassifier --> ChangeHub[change hub]
    ChangeHub --> SessionStream[session updates stream]
    ChangeHub --> EntityStream[entity CDC stream]

    MaterializerFlow --> ChangeHub
```

## Proposed Next-Pass Work

### 1) Index Contract Formalization

Files:

- [`/src/index.rs`](/src/index.rs)
- [`/src/materializer.rs`](/src/materializer.rs)

Changes:

- Introduce explicit build outcomes for skipped entities (for example: unreadable JSON, invalid ID, missing linkage).
- Replace silent `Err(_) => false` style paths with structured skip reasons.
- Add a `BuildReport` returned by builder runs, including counters and skipped-item diagnostics.

Desired effect:

- Consumers can decide whether to tolerate partial indexes or fail fast.
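
A build loop that records skip reasons instead of discarding errors could look like the following sketch. `SkipReason` mirrors the three examples named above; `index_entry`, the `msg_` prefix rule, and the `BuildReport` fields are hypothetical stand-ins for the real parsing and validation.

```rust
/// Skip reasons taken from the examples in the text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SkipReason {
    UnreadableJson,
    InvalidId,
    MissingLinkage,
}

/// Counters plus per-item diagnostics for one builder run.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct BuildReport {
    pub indexed: usize,
    pub skipped: Vec<(String, SkipReason)>,
}

/// Hypothetical per-entry check standing in for real parsing and validation:
/// accepts names of the form `msg_<number>`.
fn index_entry(name: &str) -> Result<u64, SkipReason> {
    let id = name.strip_prefix("msg_").ok_or(SkipReason::InvalidId)?;
    id.parse::<u64>().map_err(|_| SkipReason::InvalidId)
}

pub fn build_index(names: &[&str]) -> (Vec<u64>, BuildReport) {
    let mut ids = Vec::new();
    let mut report = BuildReport::default();
    for name in names {
        match index_entry(name) {
            Ok(id) => {
                ids.push(id);
                report.indexed += 1;
            }
            // Record why the entry was skipped rather than dropping the error.
            Err(reason) => report.skipped.push((name.to_string(), reason)),
        }
    }
    (ids, report)
}
```

Returning the report alongside the index leaves the tolerate-or-fail decision with the caller, who can inspect `report.skipped` before using the partial result.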

### 2) Flow Decomposition API

Files:

- [`/src/materializer.rs`](/src/materializer.rs)
- [`/src/loader.rs`](/src/loader.rs)

Changes:

- Add staged methods that mirror common user workflows:
  - resolve IDs
  - resolve metadata
  - load payloads
  - assemble projections
- Provide projection structs for common bundles (for example: session + message headers, message + part summaries).

Desired effect:

- Libraries can stop at the cheapest useful stage and avoid pulling full trees by default.

### 3) CDC/Event Type Formalization

Files:

- [`/src/watch.rs`](/src/watch.rs)
- (new) `/src/change/mod.rs`
- (new) `/src/change/event.rs`
- (new) `/src/change/stream.rs`

Changes:

- Introduce a stable event envelope with:
  - `EventCursor { generation, seq_in_generation }`
  - `ChangeEntity` keys
  - `ChangeOp` verbs
- Add two public stream surfaces:
  - low-level entity CDC stream
  - session projection update stream

Desired effect:

- Downstream libraries receive structured updates instead of inferring change meaning from raw paths.
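
The envelope could be sketched as plain data types. The integer widths, the `String` keys, the `ChangeOp` variants, the `ChangeEvent` wrapper, and `sort_for_replay` are all assumptions; only the cursor field names and the `ChangeEntity` / `ChangeOp` names come from the plan above.

```rust
/// Ordering key. The derived `Ord` compares fields in declaration order:
/// `generation` first, then `seq_in_generation`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct EventCursor {
    pub generation: u64,
    pub seq_in_generation: u64,
}

/// Identity of the changed entity, kept separate from ordering.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ChangeEntity {
    Session { project_id: String, session_id: String },
    Message { session_id: String, message_id: String },
    Part { message_id: String, part_id: String },
}

/// Hypothetical verbs; the final set is not specified in the plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChangeOp {
    Created,
    Updated,
    Removed,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChangeEvent {
    pub cursor: EventCursor,
    pub entity: ChangeEntity,
    pub op: ChangeOp,
}

/// Put events into replay order by cursor.
pub fn sort_for_replay(events: &mut [ChangeEvent]) {
    events.sort_by_key(|event| event.cursor);
}
```

Because the cursor carries ordering and the entity carries identity, two events for the same part can be compared by cursor alone without inspecting paths.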

### 4) Ordering and Determinism Sweep

Files:

- [`/src/storage/reader.rs`](/src/storage/reader.rs)
- [`/src/storage/paths.rs`](/src/storage/paths.rs)
- [`/src/index.rs`](/src/index.rs)

Changes:

- Ensure all externally-visible iteration and list APIs are explicitly ordered and documented.
- Add tests asserting order stability across repeated scans.

Desired effect:

- Consumer behavior is reproducible and easier to cache.

### 5) Integration Test Fixtures

Files:

- (new) `/tests/fixtures/...`
- (new) `/tests/index_build.rs`
- (new) `/tests/materializer_flow.rs`
- (new) `/tests/corruption_policy.rs`

Changes:

- Add fixture trees for valid, partial, and corrupted storage states.
- Validate index/report behavior and staged flow behavior.

Desired effect:

- Refactors become safer and API contracts are enforced by tests.

## Proposed Domain Grouping (Next Structure)

```text
src/
  core/
    error.rs
    id.rs
  storage/
    paths.rs
    mmap.rs
    reader.rs
  index/
    model.rs
    builder.rs
    query.rs
    report.rs
  materialize/
    session.rs
    flow.rs
    projection.rs
  change/
    event.rs
    stream.rs
    backend.rs
  types/
    session.rs
    message.rs
    part.rs
```

Notes:

- This keeps responsibilities grouped by domain, not by technical utility alone.
- Breaking changes are acceptable while crate version is `<1.0`.

## API Contract Draft (Before Docs)

### Identity keys

- Session: `(project_id, session_id)`
- Message: `(session_id, message_id)`
- Part: `(message_id, part_id)`

### Version/order keys

- `generation` (batch-level ordering boundary)
- `seq_in_generation` (within-batch ordering)

### Error policy

- Parsing and linkage issues should be represented as typed errors or skip reports.
- Avoid collapsing distinct failure modes into generic `NotFound` where context is available.

## Acceptance Criteria for This Regenesis Pass

1. Builder returns an explicit report for indexed + skipped entities.
2. Materializer exposes staged flow methods without requiring full tree assembly.
3. Change/event envelope and stream traits are public and test-covered.
4. Public list/iteration APIs document and enforce deterministic ordering.
5. Integration tests cover valid, partial, and corrupted storage trees.
6. README usage examples align with final staged API naming.

## Out of Scope (This Pass)

- Full watchman runtime implementation details.
- Durable CDC event log persistence.
- Full API docs text and exhaustive rustdoc examples.

Those come after this pass completes.

## Post-Flow-Decomposition Next Options

Flow decomposition is now implemented with staged planning/execution APIs and Bon-based options construction. The next options to pursue before API docs are:

### Option A: Flow and Index Diagnostics

Scope:

- add `FlowReport` and `BuildReport` outputs
- record indexed counts, skipped counts, and structured skip reasons

Value:

- consumers can choose strict vs tolerant policies without guessing from partial results

### Option B: Integration Fixtures for Partial/Corrupt Trees

Scope:

- add fixture-backed integration tests for valid, partial, and corrupted storage states
- assert behavior of `include_*` flags and `message_limit` / `part_limit_per_message`

Value:

- locks in behavior under real-world filesystem drift and malformed JSON

### Option C: Typed Projection Outputs to Reduce Overfetch

Scope:

- add projection structs for metadata-only and partially hydrated reads
- make projection-building methods explicit in materializer flow API

Value:

- consumer libraries can avoid full-tree loads while keeping ergonomic, typed responses
+4-1
src/lib.rs
···
     MessageMeta, MessageRole, PartKind, PartRef, SessionIndex, SessionIndexBuilder, SessionMeta,
 };
 pub use loader::{LoadedSession, MessageWithParts, SessionLoader, SessionTree};
-pub use materializer::{SessionMaterializer, Stats as MaterializerStats};
+pub use materializer::{
+    MessageFlowScope, SessionFlowOptions, SessionFlowResult, SessionFlowScope, SessionMaterializer,
+    Stats as MaterializerStats,
+};
 pub use storage::{FileReader, MappedFile, MappedFileCache, StoragePaths};
 pub use types::{
     message::{AssistantMessage, FileDiff, Message, UserMessage},