docs(journal): introduce three-layer architecture vocabulary

+128 -60

1 changed file

JOURNAL.md

···
  # Sunstone Journal Guide

- This document describes the layout of a **journal** directory where all audio, screen and analysis artifacts are stored. Each dated `YYYYMMDD` folder is referred to as a **day**, and within each day captured content is organized into **segments** (timestamped duration folders). Each segment folder uses the format `HHMMSS_LEN/` where `HHMMSS` is the start time and `LEN` is the duration in seconds. This folder name serves as the **segment key**, uniquely identifying the segment within a given day.
+ This document describes the layout of a **journal** directory where all captures, extracts, and insights are stored. Each dated `YYYYMMDD` folder is referred to as a **day**, and within each day captured content is organized into **segments** (timestamped duration folders). Each segment folder uses the format `HHMMSS_LEN/` where `HHMMSS` is the start time and `LEN` is the duration in seconds. This folder name serves as the **segment key**, uniquely identifying the segment within a given day.
+
+ ## The Three-Layer Architecture
+
+ Sunstone transforms raw recordings into actionable understanding through a three-layer pipeline:
+
+ ```
+ ┌─────────────────────────────────────┐
+ │  LAYER 3: INSIGHTS                  │  Narrative summaries
+ │  (Markdown files)                   │  "What it means"
+ │  - topics/*.md (daily insights)     │
+ │  - screen.md (segment insights)     │
+ └─────────────────────────────────────┘
+             ↑ synthesized from
+ ┌─────────────────────────────────────┐
+ │  LAYER 2: EXTRACTS                  │  Structured data
+ │  (JSON/JSONL files)                 │  "What happened"
+ │  - audio.jsonl (transcripts)        │
+ │  - screen.jsonl (frame analysis)    │
+ │  - occurrences.json (events)        │
+ └─────────────────────────────────────┘
+             ↑ derived from
+ ┌─────────────────────────────────────┐
+ │  LAYER 1: CAPTURES                  │  Raw recordings
+ │  (Binary media files)               │  "What was recorded"
+ │  - *.flac (audio)                   │
+ │  - *.webm (video)                   │
+ └─────────────────────────────────────┘
+ ```
+
+ ### Vocabulary Quick Reference
+
+ **Pipeline Layers**
+
+ | Term | Definition | Examples |
+ |------|------------|----------|
+ | **Capture** | Raw audio/video recording | `*.flac`, `*.webm` |
+ | **Extract** | Structured data from captures | `*.jsonl`, `occurrences.json` |
+ | **Insight** | AI-generated narrative summary | `topics/*.md`, `screen.md` |
+
+ **Organization**
+
+ | Term | Definition | Examples |
+ |------|------------|----------|
+ | **Day** | 24-hour activity directory | `20250119/` |
+ | **Segment** | 5-minute time window | `143022_300/` (14:30:22, 5 min) |
+ | **Facet** | Project/context scope | `#work`, `#personal` |
+
+ **Extracted Data**
+
+ | Term | Definition | Examples |
+ |------|------------|----------|
+ | **Entity** | Tracked person/project/concept | People, companies, tools |
+ | **Occurrence** | Time-based event | Meetings, messages, files |

  ## Top level files

···

  Daily entity detection files (`entities/YYYYMMDD.jsonl`) contain entities automatically discovered by agents from:
  - Journal transcripts and screen captures
- - Knowledge graphs and summaries
+ - Knowledge graphs and insights
  - News feeds and external content

  Detected entities accumulate historical context over time. Entities appearing in multiple daily detections can be promoted to attached status through the web UI or MCP tools.
···

  - `HHMMSS_LEN/` – Start time and duration in seconds (e.g., `143022_300/` for a 5-minute segment starting at 14:30:22)

- Audio capture tools write FLAC files and transcripts:
+ Each segment progresses through the three-layer pipeline: captures are recorded, extracts are generated, and insights are synthesized.
+
+ ### Layer 1: Captures
+
+ Captures are the original binary media files recorded by observation tools.
+
+ #### Audio captures
+
+ Audio files are initially written to the day root with the segment key prefix:
+
+ - `HHMMSS_LEN_*.flac` – audio files in day root (e.g., `143022_300_audio.flac`)
+
+ After transcription, audio files are moved into their segment folder:
+
+ - `HHMMSS_LEN/*.flac` – audio files moved here after processing, preserving descriptive suffix (e.g., `audio.flac`, `mic.flac`)
+
+ Note: The descriptive portion after the segment key (e.g., `_audio`, `_recording`) is preserved when files are moved into segment directories. Processing tools match files by extension only, ignoring the descriptive suffix.
+
+ #### Screen captures

- - `HHMMSS_LEN_*.flac` – audio files in day root (e.g., `143022_300_audio.flac`), moved to segment after transcription.
- - `HHMMSS_LEN/*.flac` – audio files moved here after processing, preserving descriptive suffix (e.g., `audio.flac`, `mic.flac`).
- - `HHMMSS_LEN/audio.jsonl` – transcript JSONL produced by transcription.
+ Screen recordings follow the same pattern:

- Note: The descriptive portion after the segment (e.g., `_audio`, `_recording`) is preserved when files are moved into segment directories. Processing tools match files by extension only, ignoring the descriptive suffix.
+ - `HHMMSS_LEN_*.webm` – screencast video files in day root (e.g., `143022_300_screen.webm`)
+ - `HHMMSS_LEN/*.webm` – video files moved here after analysis, preserving descriptive suffix (e.g., `screen.webm`, `monitor1.webm`)

- ### Audio transcript output
+ Videos contain monitor layout information in their metadata title field using the format:
+ ```
+ DP-3:center,1920,0,5360,1440 HDMI-4:right,5360,219,7280,1299
+ ```

- The transcript file (`*_audio.jsonl`) contains a metadata line followed by one JSON object per transcript segment.
+ Each monitor entry: `<monitor_name>:<position>,<x1>,<y1>,<x2>,<y2>` where coordinates define the monitor's bounding box in the combined virtual screen space.
+
+ ### Layer 2: Extracts
+
+ Extracts are structured data files (JSON/JSONL) derived from captures through AI analysis.
+
+ #### Audio transcript extracts
+
+ The transcript file (`audio.jsonl`) contains a metadata line followed by one JSON object per transcript segment.

  Example transcript file:

···
  - `speaker` – speaker identifier, numeric or string (optional)
  - `description` – audio-impaired style description of tone, emotion, vocal quality (optional)

- Screen capture produces screencast videos with multi-monitor metadata:
-
- - `HHMMSS_LEN_*.webm` – screencast video files in day root (e.g., `143022_300_screen.webm`), moved to segment after analysis.
- - `HHMMSS_LEN/*.webm` – video files moved here after analysis, preserving descriptive suffix (e.g., `screen.webm`, `monitor1.webm`).
- - `HHMMSS_LEN/screen.jsonl` – vision analysis results in JSON Lines format.
- - `HHMMSS_LEN/screen.md` – human-readable markdown summary of the video.
+ #### Screen frame extracts

- Note: Like audio files, the descriptive portion is preserved when files are moved into segment directories.
-
- ### Screencast video format
-
- Videos contain monitor layout information in their metadata title field using the format:
- ```
- DP-3:center,1920,0,5360,1440 HDMI-4:right,5360,219,7280,1299
- ```
-
- Each monitor entry: `<monitor_name>:<position>,<x1>,<y1>,<x2>,<y2>` where coordinates define the monitor's bounding box in the combined virtual screen space.
-
- ### Vision analysis output
-
- The analysis file (`*_screen.jsonl`) contains one JSON object per qualified frame. Frames qualify when they contain a changed region of at least 400×400 pixels, detected using block-based SSIM comparison.
+ The analysis file (`screen.jsonl`) contains one JSON object per qualified frame. Frames qualify when they contain a changed region of at least 400×400 pixels, detected using block-based SSIM comparison.

  Example frame record:

···
  2. Text extraction triggered for categories: messaging, browsing, reading, productivity
  3. Meeting analysis triggered for meeting category, provides full-screen participant detection with entity recognition

- ### Summary generation
-
- After all frames are processed, a markdown summary (`*_screen.md`) is generated from the analysis file. The summary provides a chronological narrative of the screencast, organizing frames by timestamp and including visual descriptions, extracted text, and meeting analysis where applicable.
-
- Post‑processing commands may generate additional analysis files, for example:
-
- - `topics/flow.md` – high level summary of the day.
- - `topics/knowledge_graph.md` – knowledge graph / network summary.
- - `topics/meetings.md` – meeting list used by the calendar web UI.
- - `task_log.txt` – log of tasks for that day in `[epoch]\tmessage` format.
-
- ### Crumbs
-
- Most generated files are accompanied by a `.crumb` file capturing dependencies and model information. See `CRUMBS.md` for the format. Example: `20250610/topics/flow.md.crumb`.
-
- ## Occurrence JSON
+ #### Occurrence extracts

- Several `think/topics` prompts extract time based events from the day's
- transcripts—meetings, messages, follow ups, file activity and more. To index
- these consistently the results can be normalised into an **occurrence** container
- stored as `occurrences.json` inside each day folder.
+ Insight generation prompts extract time-based events from the day's transcripts—meetings, messages, follow-ups, file activity and more. These are normalized into an **occurrence** container stored as `occurrences.json` inside each day folder.

  ```json
  {
···
  }
  ```

- ### Common fields
+ **Common fields:**
+ - **type** – the kind of occurrence such as `meeting`, `message`, `file`, `followup`, `documentation`, `research`, `media`, etc.
+ - **source** – the insight file the occurrence was extracted from
+ - **start** and **end** – HH:MM:SS timestamps containing the occurrence
+ - **title** and **summary** – short text for display and search
+ - **facet** – facet name the occurrence is associated with (e.g., "work", "personal", "ml_research")
+ - **work** – boolean, work vs. personal classification when known
+ - **participants** – optional list of people or entities involved
+ - **details** – free-form object with occurrence-specific information
+
+ This structure allows the indexer to collect and search occurrences across all days.
+
+ ### Layer 3: Insights
+
+ Insights are AI-generated markdown files that provide human-readable narratives synthesized from captures and extracts.
+
+ #### Segment insights
+
+ After all frames are processed, a segment insight (`screen.md`) is generated from the frame analysis extracts. This provides a chronological narrative of the screencast, organizing frames by timestamp and including visual descriptions, extracted text, and meeting analysis where applicable.
+
+ - `HHMMSS_LEN/screen.md` – narrative summary of the segment's screen activity
+
+ #### Daily insights

- - **type** – the kind of occurrence such as `meeting`, `message`, `file`, `followup`, `documentation`, `research`, `media`, etc.
- - **source** - the file the occurence was extracted from.
- - **start** and **end** – HH:MM:SS timestamps containing the occurence.
- - **title** and **summary** – short text for display and search.
- - **facet** – facet name the occurrence is associated with (e.g., "work", "personal", "ml_research").
- - **work** – boolean, work vs. personal classification when known.
- - **participants** – optional list of people or entities involved.
- - **details** – free-form string of other occurrence specific information.
+ Post-processing generates day-level insights that synthesize all segments:
+
+ - `topics/flow.md` – day overview and work rhythm analysis
+ - `topics/knowledge_graph.md` – entity relationships and knowledge network
+ - `topics/meetings.md` – meeting list used by the calendar web UI
+ - Additional topic-based insights as configured in `think/topics/`
+
+ Each insight type has a corresponding template in `think/topics/{name}.txt` that defines how the AI synthesizes extracts into narrative form.
+
+ #### Provenance

- Each topic analysis can map its findings into this structure allowing the
- indexer to collect and search occurrences across all days.
+ Most insights are accompanied by a `.crumb` file capturing source dependencies and model information. See **CRUMBS.md** for the format specification. Example: `20250610/topics/flow.md.crumb`.
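
Reviewer note: two of the formats this change documents, the `HHMMSS_LEN` segment key and the monitor layout string in the video title field, are simple enough to check with a small parser. The following is a minimal Python sketch under stated assumptions: the names `SegmentKey`, `Monitor`, `parse_segment_key`, and `parse_monitor_layout` are hypothetical and are not taken from the Sunstone codebase, and the sketch assumes layout entries are separated by whitespace, as in the example in the guide.

```python
import re
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class SegmentKey:
    """Hypothetical container for a parsed `HHMMSS_LEN` segment key."""
    start: time     # wall-clock start of the segment
    duration: int   # segment length in seconds


@dataclass(frozen=True)
class Monitor:
    """Hypothetical container for one `<name>:<position>,<x1>,<y1>,<x2>,<y2>` entry."""
    name: str
    position: str
    x1: int
    y1: int
    x2: int
    y2: int


# Six digits for the start time, an underscore, then the duration in seconds.
_SEGMENT_RE = re.compile(r"^(\d{2})(\d{2})(\d{2})_(\d+)$")


def parse_segment_key(key: str) -> SegmentKey:
    """Parse a segment folder name such as `143022_300/`."""
    match = _SEGMENT_RE.match(key.rstrip("/"))
    if match is None:
        raise ValueError(f"not a segment key: {key!r}")
    hh, mm, ss, length = (int(group) for group in match.groups())
    # datetime.time raises ValueError for out-of-range values such as hour 25.
    return SegmentKey(start=time(hh, mm, ss), duration=length)


def parse_monitor_layout(title: str) -> list[Monitor]:
    """Parse the layout string stored in a video's metadata title field."""
    monitors = []
    for entry in title.split():
        # Split on the first colon only: name on the left, fields on the right.
        name, _, rest = entry.partition(":")
        position, x1, y1, x2, y2 = rest.split(",")
        monitors.append(Monitor(name, position, int(x1), int(y1), int(x2), int(y2)))
    return monitors
```

For instance, `parse_segment_key("143022_300/")` yields a start time of 14:30:22 and a duration of 300 seconds, matching the 5-minute segment example in the guide. Subtracting `x1` from `x2` on a parsed `Monitor` gives that monitor's width in the combined virtual screen space.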
