commits
Update consumer documentation to reflect the implemented Option C direction:
- tree stages are now plan -> resolve -> hydrate
- resolve stage exposes mmap-backed MappedSpan leaves
- hydration is explicit and layered
Rewrite the decomposed flow example to use plan_session_tree, resolve_session_tree, and hydrate_session_tree, including an example of reading raw bytes directly from a mapped leaf.
Also document that registry internals now use papaya for index/cache maps.
Refactor run_session_flow to build on the new staged tree pipeline instead of directly executing bespoke load loops. The convenience layer now derives scope/message-scope outputs from SessionTreePlan and uses run_session_tree_flow to obtain hydrated results.
Add run_session_tree_flow as a composed wrapper over plan -> resolve -> hydrate, and merge stage issues into a single surfaced report path. This keeps the old ergonomic entrypoints while making staged behavior canonical under the hood.
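A rough, std-only sketch of how a composed wrapper over plan -> resolve -> hydrate could merge stage issues into one report. The field layout of StageResult and StageIssue, the `then` combinator, and the simplified `plan`/`resolve`/`hydrate`/`run_flow` functions are assumptions made for illustration; the real stages operate on session keys and storage handles.

```rust
/// Which stage produced an issue (assumed variants mirroring plan -> resolve -> hydrate).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StageName {
    Plan,
    Resolve,
    Hydrate,
}

/// A non-fatal problem surfaced by a stage (assumed shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StageIssue {
    pub stage: StageName,
    pub message: String,
}

/// A stage's output together with the issues it collected (assumed shape).
#[derive(Debug)]
pub struct StageResult<T> {
    pub value: T,
    pub issues: Vec<StageIssue>,
}

impl<T> StageResult<T> {
    /// Run the next stage on this stage's value and carry earlier issues forward.
    pub fn then<U>(self, next: impl FnOnce(T) -> StageResult<U>) -> StageResult<U> {
        let StageResult { value, mut issues } = self;
        let mut out = next(value);
        issues.append(&mut out.issues);
        StageResult { value: out.value, issues }
    }
}

/// Hypothetical plan stage: decide which message ids to load.
pub fn plan(message_ids: &[&str]) -> StageResult<Vec<String>> {
    StageResult {
        value: message_ids.iter().map(|s| s.to_string()).collect(),
        issues: Vec::new(),
    }
}

/// Hypothetical resolve stage: drop ids that cannot be resolved and report them.
pub fn resolve(planned: Vec<String>) -> StageResult<Vec<String>> {
    let mut issues = Vec::new();
    let mut found = Vec::new();
    for id in planned {
        if id.is_empty() {
            issues.push(StageIssue {
                stage: StageName::Resolve,
                message: "empty message id".into(),
            });
        } else {
            found.push(id);
        }
    }
    StageResult { value: found, issues }
}

/// Hypothetical hydrate stage: here it only counts the resolved references.
pub fn hydrate(refs: Vec<String>) -> StageResult<usize> {
    StageResult { value: refs.len(), issues: Vec::new() }
}

/// Composed wrapper: one call, one merged issue list.
pub fn run_flow(message_ids: &[&str]) -> StageResult<usize> {
    plan(message_ids).then(resolve).then(hydrate)
}
```

Because each stage returns its issues alongside its value instead of failing, the wrapper can keep going after a partial failure and still report everything that went wrong.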
Implement the Option C direction by introducing explicit staged tree methods that separate planning, reference resolution, and hydration:
- plan_session_tree -> StageResult<SessionTreePlan>
- resolve_session_tree -> StageResult<SessionRefTree>
- hydrate_session_tree -> StageResult<SessionHydratedTree>
This adds reference-tree node types (SessionRefNode/MessageRefNode/PartRefNode) and plan types keyed by canonical key structs. The resolve stage now maps files into MappedSpan leaves instead of eagerly deserializing payloads.
In the storage layer, add MappedSpan and span helpers (read_span/read_session_span/read_message_span/read_part_span/read_diff_span + parse_span). read_json now routes through span parsing, so the zero-copy substrate is used consistently.
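To illustrate the span idea, here is a minimal sketch that uses only the standard library. It assumes a MappedSpan is a shared buffer plus a byte range; an `Arc<[u8]>` stands in for the memory map, and the `parse_span` signature taking a caller-supplied parser is hypothetical, not the crate's actual API.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Stand-in for an mmap-backed leaf: a shared buffer plus a byte range.
/// The real type would wrap a memory map; Arc<[u8]> keeps this sketch std-only.
#[derive(Debug, Clone)]
pub struct MappedSpan {
    backing: Arc<[u8]>,
    range: Range<usize>,
}

impl MappedSpan {
    /// Build a span, rejecting ranges that fall outside the backing buffer.
    pub fn new(backing: Arc<[u8]>, range: Range<usize>) -> Option<Self> {
        if range.start <= range.end && range.end <= backing.len() {
            Some(Self { backing, range })
        } else {
            None
        }
    }

    /// Borrow the raw bytes without copying them.
    pub fn as_bytes(&self) -> &[u8] {
        &self.backing[self.range.clone()]
    }
}

/// Hypothetical span parser: hand the borrowed bytes to any decoder.
/// A JSON reader could pass a deserializer here, so every read goes through spans.
pub fn parse_span<T, E>(
    span: &MappedSpan,
    parse: impl FnOnce(&[u8]) -> Result<T, E>,
) -> Result<T, E> {
    parse(span.as_bytes())
}
```

Cloning a span in this sketch only bumps a reference count, which is what lets the resolve stage hand out leaves cheaply and defer decoding until hydration.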
Also wire new public exports for the staged/ref-tree API surface and mapped span type.
Replace std/locking map usage in SessionIndex and MappedFileCache with papaya HashMap-backed registries. This aligns registry internals with the recombination direction and prepares for watch-driven concurrent mutation patterns.
Key updates:
- SessionIndex registries now use papaya for session/message/part maps and reverse-index maps.
- Index read APIs were made clone-based to avoid leaking guard lifetimes across module boundaries.
- MappedFileCache now uses papaya pin-based operations for get/insert/remove/clear/prune.
- Materializer signatures were adjusted to consume owned index outputs (Vec/clone) cleanly.
Behavior remains equivalent from a consumer perspective while reducing lock management and unifying registry infrastructure for upcoming ref-tree flow work.
Add shared key types (SessionKey, MessageKey, PartKey, EntityKey) and shared stage/report envelope types (StageName, StageIssue, StageReport, StageResult). These are the foundation for recomb-based API convergence across index, flow, and planned CDC layers.
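One plausible shape for the shared key types, sketched with the standard library only. The field names, the nesting of keys, and the `session` and `count_by_session` helpers are assumptions for illustration; the commit only names the types.

```rust
use std::collections::HashMap;

/// Identifies a session (assumed field).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct SessionKey {
    pub session_id: String,
}

/// Identifies a message within a session (assumed nesting).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct MessageKey {
    pub session: SessionKey,
    pub message_id: String,
}

/// Identifies a part within a message (assumed nesting).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PartKey {
    pub message: MessageKey,
    pub part_id: String,
}

/// Any addressable entity, so index, flow, and change-tracking code can share one key type.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum EntityKey {
    Session(SessionKey),
    Message(MessageKey),
    Part(PartKey),
}

impl EntityKey {
    /// Every entity belongs to exactly one session.
    pub fn session(&self) -> &SessionKey {
        match self {
            EntityKey::Session(s) => s,
            EntityKey::Message(m) => &m.session,
            EntityKey::Part(p) => &p.message.session,
        }
    }
}

/// Hypothetical helper: group arbitrary entity keys by their owning session.
pub fn count_by_session(keys: &[EntityKey]) -> HashMap<SessionKey, usize> {
    let mut counts = HashMap::new();
    for key in keys {
        *counts.entry(key.session().clone()).or_insert(0) += 1;
    }
    counts
}
```

Deriving `Eq` and `Hash` is what allows the same key structs to serve as map keys in registries and plans.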
Also restore src/index.rs after earlier intermediate deletion so the crate continues to compile while we iterate toward papaya-backed registries and zero-copy reference trees.
Replace the directory-listing approach with targeted updates driven by
watchman subscription metadata. The hot tier is fed directly from watchman
mtime/size fields, falling back to stat calls. The cold tier uses mtime
comparison to detect staleness. Includes a watcher integration sketch.
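The staleness check can be sketched as follows. This is a minimal illustration, not the design's actual code: `FileStamp`, `stamp_from_stat`, and `is_stale` are hypothetical names, and comparing size in addition to mtime is an assumption added here to catch rewrites that land within one mtime tick.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// The metadata a tier remembers about a file (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FileStamp {
    pub mtime: SystemTime,
    pub size: u64,
}

/// Fallback path: obtain the stamp with a stat call when the watcher
/// did not supply mtime/size for this file.
pub fn stamp_from_stat(path: &Path) -> io::Result<FileStamp> {
    let meta = fs::metadata(path)?;
    Ok(FileStamp {
        mtime: meta.modified()?,
        size: meta.len(),
    })
}

/// A cached entry is stale when the file's current stamp differs from the cached one.
/// Using `!=` rather than `>` also catches files whose mtime moved backwards.
pub fn is_stale(cached: &FileStamp, current: &FileStamp) -> bool {
    current.mtime != cached.mtime || current.size != cached.size
}
```

With stamps supplied by the watcher, the hot tier never has to touch the filesystem to decide whether an entry is current; only the fallback path pays for a stat.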
Five alternatives: eager channel reload, content-hash validation,
two-tier hot/cold index, coarse epoch invalidation, Arc eviction.
Each with code sketch, pros/cons, and a comparison table.
Replaces the previous connection/subscription-focused watchman.md with a
design centered on generation-based dirty tracking and lazy reload.
Key design elements:
- GenerationClock: monotonic atomic counter, avoids ABA flag problems
- DirtyTracker: maps paths to the generation they were dirtied at
- TrackedMappedFileCache: generation-aware cache that reloads on access
- Watcher task: background tokio task feeding DirtyTracker from watchman
- 5-phase integration plan from primitives through graceful degradation
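The first two design elements can be sketched with the standard library alone. This is an illustrative approximation: the method names (`tick`, `now`, `mark_dirty`, `needs_reload`, `current_generation`) and the `Mutex<HashMap>` storage are assumptions, and the background watcher task is left out.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Monotonic counter: every change event gets a strictly larger generation.
#[derive(Debug, Default)]
pub struct GenerationClock(AtomicU64);

impl GenerationClock {
    /// Advance the clock and return the new generation.
    pub fn tick(&self) -> u64 {
        self.0.fetch_add(1, Ordering::AcqRel) + 1
    }

    /// Read the current generation without advancing it.
    pub fn now(&self) -> u64 {
        self.0.load(Ordering::Acquire)
    }
}

/// Maps each path to the generation at which it was last dirtied.
#[derive(Debug, Default)]
pub struct DirtyTracker {
    clock: GenerationClock,
    dirtied_at: Mutex<HashMap<PathBuf, u64>>,
}

impl DirtyTracker {
    /// Called by the watcher when a file changes; returns the assigned generation.
    pub fn mark_dirty(&self, path: &Path) -> u64 {
        let generation = self.clock.tick();
        self.dirtied_at
            .lock()
            .unwrap()
            .insert(path.to_path_buf(), generation);
        generation
    }

    /// True if the path was dirtied after the generation at which the caller loaded it.
    pub fn needs_reload(&self, path: &Path, loaded_at: u64) -> bool {
        self.dirtied_at
            .lock()
            .unwrap()
            .get(path)
            .map_or(false, |&g| g > loaded_at)
    }

    /// Generation a cache entry should record when it finishes loading.
    pub fn current_generation(&self) -> u64 {
        self.clock.now()
    }
}
```

A boolean dirty flag that a reader clears after reloading can swallow a write that arrives between the reload and the clear. Comparing generations avoids this: a later write always carries a larger number than the one the cache entry recorded, so the next access still sees it as dirty.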
SessionIndex:
- SessionMeta, MessageMeta, PartRef types for lightweight indexing
- Build index by scanning storage directories
- Relationship maps: by_session, by_message, by_project
- Lookup methods: session(), message(), part(), sessions_for_project(), etc.
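A simplified sketch of how an index with relationship maps could be laid out. The struct fields, the use of plain `String` ids, and the `insert_*` methods are assumptions for illustration; parts and the by_message map are omitted for brevity, and the directory scan that would populate the index is not shown.

```rust
use std::collections::HashMap;

/// Lightweight session record (assumed fields).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SessionMeta {
    pub id: String,
    pub project_id: String,
}

/// Lightweight message record (assumed fields).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MessageMeta {
    pub id: String,
    pub session_id: String,
}

#[derive(Debug, Default)]
pub struct SessionIndex {
    sessions: HashMap<String, SessionMeta>,
    messages: HashMap<String, MessageMeta>,
    /// session id -> message ids
    by_session: HashMap<String, Vec<String>>,
    /// project id -> session ids
    by_project: HashMap<String, Vec<String>>,
}

impl SessionIndex {
    /// Record a session and its project relationship.
    /// (This sketch does not deduplicate repeated inserts of the same id.)
    pub fn insert_session(&mut self, meta: SessionMeta) {
        self.by_project
            .entry(meta.project_id.clone())
            .or_default()
            .push(meta.id.clone());
        self.sessions.insert(meta.id.clone(), meta);
    }

    /// Record a message and its session relationship.
    pub fn insert_message(&mut self, meta: MessageMeta) {
        self.by_session
            .entry(meta.session_id.clone())
            .or_default()
            .push(meta.id.clone());
        self.messages.insert(meta.id.clone(), meta);
    }

    pub fn session(&self, id: &str) -> Option<&SessionMeta> {
        self.sessions.get(id)
    }

    pub fn message(&self, id: &str) -> Option<&MessageMeta> {
        self.messages.get(id)
    }

    /// Follow the by_project map, then resolve each id to its metadata.
    pub fn sessions_for_project(&self, project_id: &str) -> Vec<&SessionMeta> {
        self.by_project
            .get(project_id)
            .into_iter()
            .flatten()
            .filter_map(|id| self.sessions.get(id))
            .collect()
    }
}
```

The relationship maps store ids rather than records, so each lookup is two hash probes and the metadata lives in exactly one place.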
SessionMaterializer:
- Combines index with FileReader for efficient access
- Lazy content loading via mmap cache
- load_session_tree() for complete session assembly
- Query methods: sessions_by_time(), sessions_updated_since()
- Stats for monitoring index/cache size
SessionLoader provides:
- load_session: Load session info with optional diff
- load_messages/parts: Load all entities for a session/message
- load_message_with_parts: Message with its parts
- load_session_tree: Complete session with all messages and parts
- list_* methods for discovering available entities
Also exports LoadedSession, MessageWithParts, SessionTree structs.
Phase 1 - Core Foundation:
- core-id: SessionId, MessageId, PartId types with timestamp extraction
- core-type: SessionInfo, Message (User/Assistant), Part (12 variants)
- core-err: Error enum with all variants, Result alias
- stor-path: StoragePaths with XDG-compliant path resolution
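XDG-compliant resolution of a data directory can be sketched as below. The function names and the `opencode` subdirectory are assumptions for illustration; the fallback rule itself (use `$HOME/.local/share` when `XDG_DATA_HOME` is unset, empty, or not absolute) follows the XDG Base Directory Specification.

```rust
use std::path::PathBuf;

/// Resolve the data directory from explicit inputs so the logic is easy to test.
/// Returns None when neither a usable XDG_DATA_HOME nor HOME is available.
pub fn resolve_data_dir(xdg_data_home: Option<&str>, home: Option<&str>) -> Option<PathBuf> {
    let base = match xdg_data_home {
        // The spec requires an absolute path; empty or relative values are ignored.
        Some(dir) if dir.starts_with('/') => PathBuf::from(dir),
        _ => PathBuf::from(home?).join(".local").join("share"),
    };
    // "opencode" is an assumed application subdirectory.
    Some(base.join("opencode"))
}

/// Thin wrapper that reads the process environment.
pub fn data_dir_from_env() -> Option<PathBuf> {
    let xdg = std::env::var("XDG_DATA_HOME").ok();
    let home = std::env::var("HOME").ok();
    resolve_data_dir(xdg.as_deref(), home.as_deref())
}
```

Keeping the environment lookup separate from the resolution rule means tests never need to mutate process-wide environment variables.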
Phase 2 - File Reading:
- stor-mmap: MappedFile wrapper and MappedFileCache
- stor-read: FileReader with list/read methods for all entity types
All types implement serde Serialize/Deserialize. 10 tests passing.
Define phased development approach:
- Phase 1: Core Foundation (core-id, core-type, core-err, stor-path)
- Phase 2: File Reading (stor-mmap, stor-read, pars-sess, pars-part)
- Phase 3: Session Assembly (load-sess, load-msg, load-part)
- Phase 4: Index & Materializer (idx-meta, idx-build, mat-sess, mat-query)
- Phase 5: Real-time Watching (watch-ev, watch-sess, watch-idx)
Includes dependency graph and milestone deliverables.
Research opencode session storage format and design a Rust crate architecture:
- Analyzed anomalyco/opencode for on-disk session file format
- Analyzed chriswritescode-dev/opencode-manager for API patterns
- Documented session/message/part JSON schemas and storage layout
- Designed zero-copy materialized session architecture using mmap
- Researched watchman_client for real-time file watching
- Documented fallback strategies (notify crate, polling)