Merge branch 'main' of github.com:kognova/sunstone

+2 -2

README.md

··· 10 10 ### Core Modules 11 11 12 12 - **Observe** 👁️👂 - Multimodal capture and analysis 13 - - `observe-gnome` - Screencast monitoring on Linux/GNOME 13 + - `observer` - Screen and audio capture (auto-detects platform) 14 14 - `observe-describe` - Analyzes visual changes with AI 15 15 - `observe-transcribe` - Transcribes audio with AI APIs 16 16 - `observe-sense` - Unified observation coordination 17 - - *Note: Requires Linux with GNOME desktop* 17 + - *Supports Linux/GNOME and macOS* 18 18 19 19 - **Think** 🧠 - Data processing and insights 20 20 - `think-insight` - Generates AI-powered insights and summaries

+1 -1

docs/CALLOSUM.md

··· 53 53 **Purpose:** Real-time stdout/stderr streaming and process exit events 54 54 55 55 ### `observe` - Multimodal capture processing events 56 - **Source:** `observe/gnome/observer.py`, `observe/sense.py`, `observe/describe.py`, `observe/transcribe.py` 56 + **Source:** `observe/observer.py` (delegates to `observe/gnome/observer.py` or `observe/macos/observer.py`), `observe/sense.py`, `observe/describe.py`, `observe/transcribe.py` 57 57 **Events:** `status`, `observing`, `detected`, `described`, `transcribed`, `observed` 58 58 **Fields:** 59 59 - `status`: Periodic state (every 5s while running)

+4 -4

docs/DOCTOR.md

··· 16 16 17 17 ```bash 18 18 # Check if supervisor services are running 19 - pgrep -af "observe-gnome|observe-sense|think-supervisor" 19 + pgrep -af "observer|observe-sense|think-supervisor" 20 20 21 21 # Check Callosum socket exists 22 22 ls -la $JOURNAL_PATH/health/callosum.sock ··· 40 40 | Service | Command | Purpose | Auto-restart | 41 41 |---------|---------|---------|--------------| 42 42 | Callosum | (in-process) | Message bus for inter-service events | No | 43 - | Observer | `observe-gnome` | Screen/audio capture | Yes | 43 + | Observer | `observer` | Screen/audio capture (platform-detected) | Yes | 44 44 | Sense | `observe-sense` | File detection, processing dispatch | Yes | 45 45 46 46 Cortex (agent execution) connects to Callosum but runs independently via `muse-cortex`. ··· 62 62 63 63 ```bash 64 64 # Tail current observer log 65 - tail -f $JOURNAL_PATH/health/observe-gnome.log 65 + tail -f $JOURNAL_PATH/health/observer.log 66 66 67 67 # Find today's logs 68 68 ls -la $JOURNAL_PATH/$(date +%Y%m%d)/health/ ··· 132 132 133 133 ```bash 134 134 # Check observer log for errors 135 - tail -50 $JOURNAL_PATH/health/observe-gnome.log | grep -i error 135 + tail -50 $JOURNAL_PATH/health/observer.log | grep -i error 136 136 137 137 # Check if observer is emitting status (supervisor.status will show stale_heartbeats) 138 138 # Health is derived from observe.status Callosum events

+5 -3

docs/OBSERVE.md

··· 6 6 7 7 | Command | Purpose | 8 8 |---------|---------| 9 - | `observe-gnome` | Screen and audio capture on Linux/GNOME | 10 - | `observe-macos` | Screen and audio capture on macOS | 9 + | `observer` | Screen and audio capture (auto-detects platform) | 10 + | `observe-gnome` | Screen and audio capture on Linux/GNOME (direct) | 11 + | `observe-macos` | Screen and audio capture on macOS (direct) | 11 12 | `observe-transcribe` | Audio transcription with speaker diarization | 12 13 | `observe-describe` | Visual analysis of screen recordings | 13 14 | `observe-sense` | Unified observation coordination | ··· 15 16 ## Architecture 16 17 17 18 ``` 18 - observe-gnome/macos (capture) 19 + observer (platform-detected capture) 19 20 ↓ 20 21 Raw media files (*.flac, *.webm) 21 22 ↓ ··· 26 27 27 28 ## Key Components 28 29 30 + - **observer.py** - Unified entry point with platform detection 29 31 - **gnome/observer.py**, **macos/observer.py** - Platform-specific capture using native APIs 30 32 - **sense.py** - File watcher that dispatches transcription and description jobs 31 33 - **transcribe.py** - Audio processing with Whisper/Rev.ai and pyannote diarization

+4 -4

muse/agents/doctor.txt

··· 43 43 ## Diagnostic Procedures 44 44 45 45 ### Quick Health Check 46 - 1. Check if supervisor services are running: `pgrep -af "observe-gnome|observe-sense|think-supervisor"` 46 + 1. Check if supervisor services are running: `pgrep -af "observer|observe-sense|think-supervisor"` 47 47 2. Check Callosum socket exists: `ls -la health/callosum.sock` 48 48 3. Check for stuck agents: `ls agents/*_active.jsonl 2>/dev/null` 49 - 4. Check observer log for recent activity: `tail -20 health/observe-gnome.log` 49 + 4. Check observer log for recent activity: `tail -20 health/observer.log` 50 50 51 51 **Healthy state:** 52 52 - All three processes running ··· 56 56 57 57 ### Service Status 58 58 Check specific service logs: 59 - - Observer: `tail -50 health/observe-gnome.log` 59 + - Observer: `tail -50 health/observer.log` 60 60 - Sense: `tail -50 health/observe-sense.log` 61 61 - Supervisor: Check for `think-supervisor` process 62 62 ··· 69 69 ### Common Issues 70 70 71 71 **Observer not capturing:** 72 - - Check log for errors: `tail -50 health/observe-gnome.log | grep -i error` 72 + - Check log for errors: `tail -50 health/observer.log | grep -i error` 73 73 - Check for recent status emissions in log (health is derived from Callosum events) 74 74 - Causes: DBus issues, screencast permissions, audio device unavailable 75 75

+130 -350

observe/macos/TODO.md

··· 2 2 3 3 This document tracks the remaining work to complete the macOS observer integration using sck-cli and ScreenCaptureKit. 4 4 5 - ## Phase 1: Activity Detection (activity.py) 5 + ## Phase 1: Activity Detection (activity.py) - DONE 6 6 7 - ### 1.1 Implement `get_idle_time_ms()` 8 - - [ ] Import PyObjC Quartz framework 9 - - [ ] Use `CGEventSourceSecondsSinceLastEventType(1, kCGAnyInputEventType)` 10 - - [ ] Convert seconds to milliseconds 11 - - [ ] Add error handling for API failures 12 - - [ ] Test on macOS system 13 - 14 - **Example:** 15 - ```python 16 - from Quartz import CGEventSourceSecondsSinceLastEventType, kCGAnyInputEventType 7 + ### 1.1 Implement `get_idle_time_ms()` (DONE) 8 + - [x] Import PyObjC Quartz framework 9 + - [x] Use `CGEventSourceSecondsSinceLastEventType(1, kCGAnyInputEventType)` 10 + - [x] Convert seconds to milliseconds 11 + - [x] Add error handling for API failures 12 + - [x] Test on macOS system 17 13 18 - def get_idle_time_ms() -> int: 19 - seconds = CGEventSourceSecondsSinceLastEventType(1, kCGAnyInputEventType) 20 - return int(seconds * 1000) 21 - ``` 14 + ### 1.2 Implement `is_screen_locked()` (DONE) 15 + - [x] Used CGSessionCopyCurrentDictionary for kCGSSessionOnConsoleKey 16 + - [x] Add error handling 17 + - [x] Test on macOS system 22 18 23 - ### 1.2 Implement `is_screen_locked()` 24 - - [ ] Research best approach for screen lock detection: 25 - - Option A: Query CGSessionCopyCurrentDictionary for kCGSSessionOnConsoleKey 26 - - Option B: Use `ioreg -c IOHIDSystem | grep HIDIdleTime` 27 - - Option C: Check display sleep state as proxy 28 - - [ ] Implement chosen method with PyObjC 29 - - [ ] Add fallback if primary method unavailable 30 - - [ ] Test with actual screen lock/unlock cycles 31 - - [ ] Handle edge cases (fast user switching, etc.) 19 + ### 1.3 Implement `is_power_save_active()` (DONE) 20 + - [x] Used CGDisplayIsAsleep(CGMainDisplayID()) 21 + - [x] Add error handling 22 + - [x] Test on macOS system 32 23 33 - ### 1.3 Implement `is_power_save_active()` 34 - - [ ] Investigate IOKit display state query 35 - - [ ] Check NSScreen APIs for display power state 36 - - [ ] Alternative: subprocess call to `pmset -g` or `system_profiler` 37 - - [ ] Return True if displays are sleeping/powered off 38 - - [ ] Test with display sleep/wake 24 + ### 1.4 Implement `is_output_muted()` (DONE) 25 + - [x] Used osascript to query volume settings 26 + - [x] Add error handling and timeout 27 + - [x] Test on macOS system 39 28 40 - ### 1.4 Implement `get_monitor_geometries()` 41 - - [ ] Import Cocoa NSScreen framework 42 - - [ ] Get all screens: `NSScreen.screens()` 43 - - [ ] For each screen: 44 - - Extract frame geometry: `screen.frame()` 45 - - Get device description for unique ID: `screen.deviceDescription()` 46 - - Handle NSScreenNumber, display ID, etc. 47 - - [ ] Compute union bounding box of all monitors 48 - - [ ] Calculate midlines (union_mid_x, union_mid_y) 49 - - [ ] Assign position labels based on intersection with midlines: 50 - - "center", "left", "right", "top", "bottom", "left-top", etc. 51 - - [ ] Return list of dicts: `[{"id": "...", "box": [x1,y1,x2,y2], "position": "..."}]` 52 - - [ ] Test with single monitor, dual monitor, triple monitor setups 53 - - [ ] Handle coordinate system (Cocoa uses bottom-left origin) 29 + ## Phase 2: ScreenCaptureKit Manager (screencapture.py) 54 30 55 - ### 1.5 Implement `get_monitor_metadata_string()` 56 - - [ ] Call `get_monitor_geometries()` 57 - - [ ] Format as: `"0:center,0,0,1920,1080 1:right,1920,0,3840,1080"` 58 - - [ ] Test output format matches GNOME format exactly 31 + **Note:** sck-cli now provides multi-display capture with JSONL metadata output to stdout. 32 + Display geometry is parsed from sck-cli output - no PyObjC monitor detection needed. 59 33 60 - ## Phase 2: ScreenCaptureKit Manager (screencapture.py) 34 + ### 2.1 JSONL Parsing (DONE) 35 + - [x] Parse sck-cli stdout for display geometry 36 + - [x] Extract displayID, x, y, width, height per display 37 + - [x] Use `assign_monitor_positions()` to compute position labels 38 + - [x] Build DisplayInfo objects with position, displayID, temp_path 61 39 62 - ### 2.1 Implement `start()` 63 - - [ ] Validate sck-cli is available in PATH or at specified path 64 - - [ ] Build command: `[sck_cli_path, str(output_base), "-r", str(frame_rate), "-l", str(duration)]` 65 - - [ ] Launch subprocess with `subprocess.Popen()` 66 - - [ ] Store process handle and output_base in instance variables 67 - - [ ] Add stderr/stdout capture for debugging 68 - - [ ] Return success/failure 69 - - [ ] Test with various parameters 40 + ### 2.2 Implement `start()` (DONE) 41 + - [x] Build command with frame rate and duration 42 + - [x] Launch subprocess and capture stdout 43 + - [x] Parse JSONL for display and audio info 44 + - [x] Return list of DisplayInfo and AudioInfo 70 45 71 - ### 2.2 Implement `stop()` 72 - - [ ] Check if process exists and is running 73 - - [ ] Send SIGTERM to process 74 - - [ ] Wait with timeout (5 seconds) for graceful shutdown 75 - - [ ] If timeout, send SIGKILL as fallback 76 - - [ ] Clear process handle and output_base 77 - - [ ] Log any stderr output from process 78 - - [ ] Test graceful and forced shutdown scenarios 46 + ### 2.3 Implement `stop()` (DONE) 47 + - [x] Send SIGTERM to process 48 + - [x] Wait with timeout for graceful shutdown 49 + - [x] SIGKILL as fallback 79 50 80 - ### 2.3 Implement `is_running()` 81 - - [ ] Check if `self.process` is not None 82 - - [ ] Use `self.process.poll()` to check if still running 83 - - [ ] Return True if running, False otherwise 51 + ### 2.4 Implement `finalize()` (DONE) 52 + - [x] Simple file rename (no metadata embedding needed) 53 + - [x] Rename per-display: `temp_displayID.mov` -> `HHMMSS_LEN_position_displayID_screen.mov` 54 + - [x] Rename audio: `temp.m4a` -> `HHMMSS_LEN_audio.m4a` 84 55 85 - ### 2.4 Implement `finalize()` 86 - - [ ] Check if temp files exist: `temp_base.mov`, `temp_base.m4a` 87 - - [ ] If files missing, log error and return failure 88 - - [ ] Add monitor metadata to video file using one of: 89 - - Option A: ffmpeg: `ffmpeg -i input.mov -metadata title="..." -c copy output.mov` 90 - - Option B: PyObjC AVFoundation APIs to modify metadata in-place 91 - - [ ] Atomically rename temp video file to final path: `os.replace()` 92 - - [ ] Atomically rename temp audio file to final path: `os.replace()` 93 - - [ ] Return tuple of (video_success, audio_success) 94 - - [ ] Handle errors gracefully (log but don't crash) 95 - - [ ] Test with actual sck-cli output files 56 + ### 2.5 Implement `get_output_size()` (DONE) 57 + - [x] Sum sizes of all display video files 58 + - [x] Used for health check file growth verification 96 59 97 - ### 2.5 Implement `get_output_size()` 98 - - [ ] Check if `self.current_output_base` is set 99 - - [ ] Build path to .mov file 100 - - [ ] Use `os.path.getsize()` to get file size 101 - - [ ] Return 0 if file doesn't exist or error 102 - - [ ] Used for health check file growth verification 60 + ## Phase 3: Main Observer (observer.py) - DONE 103 61 104 - ## Phase 3: Main Observer (observer.py) 62 + ### 3.1 Implement `setup()` (DONE) 63 + - [x] Verify sck-cli is available in PATH via shutil.which() 64 + - [x] Initialize Callosum connection 65 + - [x] Start Callosum connection 66 + - [x] Log initialization success 67 + - [x] Return True on success, False on failure 105 68 106 - ### 3.1 Implement `setup()` 107 - - [ ] Verify sck-cli is available in PATH 108 - - [ ] Create ScreenCaptureKitManager instance 109 - - [ ] Initialize Callosum connection 110 - - [ ] Start Callosum connection 111 - - [ ] Log initialization success 112 - - [ ] Return True on success, False on failure 69 + ### 3.2 Implement `check_activity_status()` (DONE) 70 + - [x] Call `get_idle_time_ms()` from activity module 71 + - [x] Call `is_screen_locked()` from activity module 72 + - [x] Call `is_output_muted()` from activity module 73 + - [x] Cache values in instance variables for status events 74 + - [x] Determine if idle: `(idle_time > IDLE_THRESHOLD_MS) or screen_locked` 75 + - [x] Return activity status 113 76 114 - ### 3.2 Implement `check_activity_status()` 115 - - [ ] Call `get_idle_time_ms()` from activity module 116 - - [ ] Call `is_screen_locked()` from activity module 117 - - [ ] Cache values in instance variables for status events 118 - - [ ] Determine if idle: `(idle_time > IDLE_THRESHOLD_MS) or screen_locked` 119 - - [ ] Set `self.cached_is_active = not is_idle` 120 - - [ ] Return activity status 77 + ### 3.3 Implement `handle_boundary()` (DONE) 78 + - [x] Get timestamp parts and calculate duration 79 + - [x] Stop capture if running 80 + - [x] Check audio threshold (3-chunk RMS logic) before saving audio 81 + - [x] Build finalization list and queue 82 + - [x] Reset timing for new window 83 + - [x] Start new capture if active and screen not locked 84 + - [x] Emit Callosum observing event with saved files 121 85 122 - ### 3.3 Implement `handle_boundary()` 123 - - [ ] Get timestamp parts and calculate duration 124 - - [ ] Get day directory path 125 - - [ ] If capture running: 126 - - Stop sck-cli via `self.screencapture.stop()` 127 - - Build temp base path (e.g., `.120000`) 128 - - Build final paths with duration (e.g., `120000_300_screen.mov`, `120000_300_audio.m4a`) 129 - - Queue for finalization: `self.pending_finalization = (temp_base, final_video, final_audio)` 130 - - Clear state variables 131 - - [ ] Reset timing: `self.start_at = time.time()`, `self.start_at_mono = time.monotonic()` 132 - - [ ] If active and screen not locked: 133 - - Call `initialize_capture()` 134 - - [ ] Build list of files that were captured 135 - - [ ] Emit Callosum event: `self.callosum.emit("observe", "observing", segment="...", files=[...])` 136 - - [ ] Log boundary handling 86 + ### 3.4 Implement `initialize_capture()` (DONE) 87 + - [x] Get timestamp for filename 88 + - [x] Build temp output base (hidden file) 89 + - [x] Start sck-cli via ScreenCaptureKitManager 90 + - [x] Store displays and audio info 91 + - [x] Initialize file size tracking 92 + - [x] Log capture start with display info 137 93 138 - ### 3.4 Implement `initialize_capture()` 139 - - [ ] Get timestamp for filename 140 - - [ ] Get day directory path 141 - - [ ] Build temp output base: `day_dir / f".{time_part}"` (hidden file) 142 - - [ ] Call `self.screencapture.start(output_base, self.interval, frame_rate=1.0)` 143 - - [ ] If success: 144 - - Set `self.capture_running = True` 145 - - Set `self.current_output_base = output_base` 146 - - Set `self.last_video_size = 0` 147 - - Log capture start 148 - - Return True 149 - - [ ] Else: 150 - - Log failure 151 - - Return False 94 + ### 3.5 Implement `emit_status()` (DONE) 95 + - [x] Build capture info dict with recording status, displays, elapsed time, files_growing 96 + - [x] Build activity info dict with active, idle_time_ms, screen_locked, output_muted 97 + - [x] Emit via Callosum 152 98 153 - ### 3.5 Implement `emit_status()` 154 - - [ ] Build capture info dict: 155 - - If capturing: `{"recording": True, "file": "...", "window_elapsed_seconds": ...}` 156 - - Else: `{"recording": False}` 157 - - [ ] Build activity info dict: `{"active": ..., "idle_time_ms": ..., "screen_locked": ...}` 158 - - [ ] Emit via Callosum: `self.callosum.emit("observe", "status", capture=..., activity=...)` 99 + ### 3.6 Implement `finalize_screencast()` (DONE) 100 + - [x] Simple file rename using os.replace() 101 + - [x] Log success/failure 159 102 160 - ### 3.6 Implement `finalize_capture()` 161 - - [ ] Check if temp files exist 162 - - [ ] If missing, log warning and return 163 - - [ ] Get monitor metadata string: `get_monitor_metadata_string()` 164 - - [ ] Call `self.screencapture.finalize(temp_base, final_video, final_audio, monitor_metadata)` 165 - - [ ] Log success/failure 166 - - [ ] Return finalization status 103 + ### 3.7 Implement `main_loop()` (DONE) 104 + - [x] Check initial activity status 105 + - [x] Start initial capture if active 106 + - [x] Main loop with CHUNK_DURATION sleep intervals 107 + - [x] Process pending finalizations 108 + - [x] Check activity status and detect activation edge 109 + - [x] Detect mute state transitions (triggers boundary like GNOME) 110 + - [x] Handle window boundaries 111 + - [x] Track file growth for health reporting 112 + - [x] Emit status events 167 113 168 - ### 3.7 Implement `main_loop()` 169 - - [ ] Check initial activity status 170 - - [ ] If active and not locked, start initial capture 171 - - [ ] Main loop while `self.running`: 172 - - Sleep for CHUNK_DURATION (5 seconds) 173 - - Process pending finalization if queued 174 - - Check activity status 175 - - Detect activation edge: `is_active and not self.capture_running` 176 - - Calculate elapsed time since window start (monotonic) 177 - - Check for boundary: `elapsed >= self.interval or activation_edge` 178 - - If boundary, call `handle_boundary(is_active)` 179 - - Track if capture files are growing (for health reporting via status event) 180 - - Emit status event with `screencast.files_growing` field (supervisor derives health from this) 181 - - [ ] Call `shutdown()` after loop exits 114 + ### 3.8 Implement `shutdown()` (DONE) 115 + - [x] Stop capture if running 116 + - [x] Check audio threshold for final segment 117 + - [x] Finalize all pending captures 118 + - [x] Stop Callosum connection 182 119 183 - ### 3.8 Implement `shutdown()` 184 - - [ ] If capture running: 185 - - Stop capture 186 - - Wait briefly (1 second) for files to be written 187 - - Build final paths 188 - - Call `finalize_capture()` for current capture 189 - - [ ] If pending finalization exists: 190 - - Wait briefly 191 - - Call `finalize_capture()` for pending 192 - - [ ] Stop Callosum connection 193 - - [ ] Log shutdown complete 120 + ### 3.9 Implement `_check_audio_threshold()` (DONE) 121 + - [x] Decode m4a with PyAV 122 + - [x] Split into 5-second chunks 123 + - [x] Compute RMS per chunk 124 + - [x] Count threshold hits (same MIN_HITS_FOR_SAVE = 3 as GNOME) 125 + - [x] Return True if enough voice activity 194 126 195 - ### 3.9 Wire up CLI arguments 196 - - [ ] Add `--sck-cli-path` argument support 197 - - [ ] Pass to ScreenCaptureKitManager constructor 198 - - [ ] Test CLI invocation: `observe-macos --interval 300` 127 + ### 3.10 Wire up CLI arguments (DONE) 128 + - [x] Pass --sck-cli-path to ScreenCaptureKitManager 199 129 200 130 ## Phase 4: Testing & Integration 201 131 ··· 208 138 - [ ] Test window boundaries and file naming 209 139 - [ ] Test graceful shutdown (Ctrl-C) 210 140 - [ ] Verify Callosum events emitted 211 - - [ ] Verify health files touched 212 141 213 142 ### 4.2 Multi-Monitor Testing 214 - - [ ] Test with single monitor 143 + - [ ] Test with single monitor (position should be "center") 215 144 - [ ] Test with dual monitors (side-by-side) 216 145 - [ ] Test with three monitors 217 - - [ ] Verify monitor metadata in video files 146 + - [ ] Verify per-display files with position labels 218 147 - [ ] Test monitor arrangement changes during capture 219 148 220 149 ### 4.3 Edge Cases ··· 228 157 229 158 ### 4.4 Integration with Downstream Tools 230 159 - [ ] Verify observe-describe works with .mov files 231 - - [ ] Verify observe-sense works with .m4a audio 232 - - [ ] Test or update tools expecting .flac to handle .m4a 233 - - [ ] Test monitor metadata parsing from video titles 160 + - [ ] Verify observe-sense dispatches .mov to describe and .m4a to transcribe 161 + - [ ] Test parse_screen_filename() with new displayID format 234 162 - [ ] Verify think-indexer handles new file formats 235 163 236 - ## Phase 5: sck-cli Enhancements 237 - 238 - ### 5.1 High Priority: Monitor Metadata Capture 239 - **Investigate:** 240 - - [ ] Can `SCShareableContent` provide display arrangement/geometry? 241 - - [ ] Research `SCDisplay` properties (displayID, width, height, frame) 242 - - [ ] Can we get display position in global coordinate space? 243 - - [ ] Can we distinguish between primary and secondary displays? 244 - 245 - **Implement:** 246 - - [ ] Add code to capture monitor geometry at capture start 247 - - [ ] Store in video metadata (QuickTime user data or title field) 248 - - [ ] Format: `"0:center,0,0,1920,1080 1:right,1920,0,3840,1080"` 249 - - [ ] Test with multiple monitor configurations 250 - - [ ] Document in sck-cli README 251 - 252 - **Benefits:** 253 - - Enables per-monitor analysis in observe-describe 254 - - Matches GNOME screencast format for compatibility 255 - - Essential for downstream processing 256 - 257 - ### 5.2 High Priority: Temp File Support 258 - **Investigate:** 259 - - [ ] Add CLI flag: `--temp` or `--hidden` 260 - - [ ] When enabled, write to `.{basename}.mov` and `.{basename}.m4a` 261 - - [ ] Python wrapper then renames after completion 262 - 263 - **Implement:** 264 - - [ ] Add flag to ArgumentParser 265 - - [ ] Modify output path construction in SCKShot.swift 266 - - [ ] Test that files are hidden on macOS (ls -a shows them) 267 - - [ ] Document flag in README 268 - 269 - **Benefits:** 270 - - Prevents file watchers from triggering on incomplete files 271 - - Cleaner integration with Sunstone's workflow 272 - - Matches GNOME observer pattern 273 - 274 - ### 5.3 High Priority: Graceful Shutdown 275 - **Investigate:** 276 - - [ ] Verify current SIGTERM/SIGINT handling 277 - - [ ] Ensure VideoWriter.finish() is called on interrupt 278 - - [ ] Ensure AudioWriter finishes both tracks properly 279 - - [ ] Test file validity after various interrupt scenarios 280 - 281 - **Implement:** 282 - - [ ] Add proper signal handlers if missing 283 - - [ ] Ensure clean shutdown path exercises all finish() methods 284 - - [ ] Test: Start capture, wait 5 sec, send SIGTERM, verify files valid 285 - - [ ] Test: Start capture, wait 30 sec, send SIGINT, verify files valid 286 - 287 - **Benefits:** 288 - - Ensures data integrity 289 - - Critical for reliable operation 290 - - Prevents corrupt files on shutdown 291 - 292 - ### 5.4 Medium Priority: Multi-Display Support 293 - **Investigate:** 294 - - [ ] Currently captures "first" display - which one exactly? 295 - - [ ] Can we capture multiple displays simultaneously? 296 - - [ ] Would require multiple StreamOutput/VideoWriter instances 297 - - [ ] Or capture combined virtual display space? 298 - 299 - **Implement:** 300 - - [ ] Add flag: `--display <id>` or `--display all` 301 - - [ ] Allow specifying which display(s) to capture 302 - - [ ] If "all", consider whether to: 303 - - Create separate files per display, or 304 - - Capture combined virtual space (current behavior?) 305 - - [ ] Document display selection in README 306 - 307 - **Benefits:** 308 - - Flexibility for multi-monitor setups 309 - - May reduce file size if only one display active 310 - - Future-proofing 311 - 312 - ### 5.5 Medium Priority: Exit Code Validation 313 - **Investigate:** 314 - - [ ] Add validation before exit: 315 - - Check output files exist 316 - - Check files have non-zero size 317 - - Check video file is valid (can open with AVFoundation) 318 - - Check audio file is valid 319 - 320 - **Implement:** 321 - - [ ] Return exit code 0 only if all validations pass 322 - - [ ] Return exit code 1 if capture failed 323 - - [ ] Return exit code 2 if files missing/corrupt 324 - - [ ] Log specific error messages to stderr 325 - 326 - **Benefits:** 327 - - Python wrapper can detect failures reliably 328 - - Better error handling and debugging 329 - - Prevents silent failures 330 - 331 - ### 5.6 Medium Priority: Metadata Embedding 332 - **Investigate:** 333 - - [ ] What metadata can be embedded in .mov container? 334 - - [ ] QuickTime user data atoms for custom fields? 335 - - [ ] Standard fields: title, comment, creation date, etc. 336 - - [ ] Can we store capture settings (frame rate, duration, display ID)? 337 - 338 - **Implement:** 339 - - [ ] Add custom metadata fields: 340 - - Capture frame rate 341 - - Capture duration (planned) 342 - - Display ID(s) captured 343 - - Monitor geometry string 344 - - sck-cli version 345 - - [ ] Use AVFoundation APIs to write metadata 346 - - [ ] Test metadata survives file copy/move 347 - - [ ] Document metadata fields 348 - 349 - **Benefits:** 350 - - Self-documenting files 351 - - Enables smarter downstream processing 352 - - Helpful for debugging capture issues 353 - 354 - ### 5.7 Low Priority: Frame Timestamp Accuracy 355 - **Investigate:** 356 - - [ ] Verify CMSampleBuffer presentation timestamps are accurate 357 - - [ ] Test frame extraction at specific timestamps 358 - - [ ] Ensure timestamps align with audio timestamps 359 - - [ ] Test with different frame rates 360 - 361 - **Implement:** 362 - - [ ] Add logging of frame timestamps if not already present 363 - - [ ] Validate timestamps against wall clock 364 - - [ ] Document timestamp behavior in README 365 - 366 - **Benefits:** 367 - - Important for visual analysis alignment 368 - - Ensures audio/video sync 369 - - Critical for accurate playback 370 - 371 - ### 5.8 Low Priority: Output Path Validation 372 - **Investigate:** 373 - - [ ] Add validation before capture starts: 374 - - Parent directory exists 375 - - Parent directory is writable 376 - - Output files don't already exist 377 - - Sufficient disk space available 164 + ## Phase 5: sck-cli (DONE) 378 165 379 - **Implement:** 380 - - [ ] Add pre-flight checks in run() method 381 - - [ ] Print clear error messages for each failure case 382 - - [ ] Exit early if validation fails 383 - - [ ] Document error messages 384 - 385 - **Benefits:** 386 - - Better user experience 387 - - Prevents wasted capture attempts 388 - - Clearer error messages 166 + All sck-cli requirements are met: 167 + - [x] Multi-display capture with per-display files 168 + - [x] JSONL metadata output to stdout 169 + - [x] Temp file support (Python passes hidden path like `.HHMMSS`) 170 + - [x] Graceful SIGTERM/SIGINT handling (verified) 171 + - [x] File validation done in Python's `finalize()` 389 172 390 173 ## Phase 6: Documentation & Polish 391 174 ··· 413 196 414 197 ## Notes 415 198 199 + ### Architecture Changes from Original Plan 200 + - **No PyObjC monitor detection needed**: sck-cli provides display geometry via JSONL stdout 201 + - **No metadata embedding**: Position/displayID encoded in filename instead 202 + - **Multi-display from day one**: sck-cli captures all displays automatically 203 + - **DisplayInfo dataclass**: Mirrors GNOME's StreamInfo pattern 204 + 205 + ### File Naming Convention 206 + - **Video**: `HHMMSS_LEN_position_displayID_screen.mov` (e.g., `120000_300_center_1_screen.mov`) 207 + - **Audio**: `HHMMSS_LEN_audio.m4a` (e.g., `120000_300_audio.m4a`) 208 + - **Temp files**: `.HHMMSS_displayID.mov`, `.HHMMSS.m4a` (hidden during capture) 209 + 416 210 ### Differences from GNOME Observer 417 211 - **Audio**: sck-cli provides synchronized .m4a instead of separate AudioRecorder 418 - - **Format**: .mov video instead of .webm (or optionally convert with ffmpeg) 212 + - **Format**: .mov video instead of .webm 419 213 - **Activity APIs**: PyObjC instead of DBus 420 214 - **Subprocess**: Manages external sck-cli process instead of direct API calls 421 - - **No RMS threshold**: Audio always captured when recording (rely on VAD post-processing) 422 - 423 - ### Key Design Decisions 424 - - Use sck-cli's native audio to avoid synchronization complexity 425 - - Mirror GNOME observer architecture for consistency 426 - - Use PyObjC for native system APIs (parallels DBus approach) 427 - - Accept .m4a format (update downstream tools if needed) 428 - - Temp file pattern (`.HHMMSS`) prevents premature file watcher triggers 215 + - **Connector ID**: Uses numeric displayID instead of connector names like "DP-3" 216 + - **No RMS threshold**: Audio always captured when recording 429 217 430 218 ### Dependencies 431 219 - sck-cli must be built and available in PATH (or specified via --sck-cli-path) 432 - - PyObjC frameworks required: core, Cocoa, Quartz 433 - - Optional: ffmpeg for video metadata manipulation (if not using PyObjC) 434 - - Optional: ffmpeg for .mov → .webm conversion (if desired) 220 + - PyObjC frameworks required: core, Cocoa, Quartz (for activity detection only) 221 + - observe.utils.assign_monitor_positions for position label computation 435 222 436 223 ### Testing Strategy 437 224 1. Start with activity.py (testable independently) 438 - 2. Then screencapture.py (can test with mock sck-cli) 225 + 2. Then screencapture.py (can test with mock sck-cli or real capture) 439 226 3. Then observer.py (integration testing) 440 227 4. Finally sck-cli enhancements (separate repo) 441 - 442 - ### Future Enhancements 443 - - Consider adding VAD post-processing to match GNOME's threshold logic 444 - - Consider .mov → .webm conversion for format consistency 445 - - Consider .m4a → .flac conversion if downstream tools require it 446 - - Add observe-macos-test command for validation 447 - - Add metrics/telemetry for capture success rates

+69 -88

observe/macos/activity.py

··· 5 5 """ 6 6 7 7 import logging 8 - from typing import Optional 8 + import subprocess 9 + 10 + from Quartz import ( 11 + CGDisplayIsAsleep, 12 + CGEventSourceSecondsSinceLastEventType, 13 + CGMainDisplayID, 14 + CGSessionCopyCurrentDictionary, 15 + kCGAnyInputEventType, 16 + ) 9 17 10 18 logger = logging.getLogger(__name__) 11 - 12 - # IDLE_THRESHOLD_MS is defined in observer.py, but useful to know the typical value 13 - # IDLE_THRESHOLD_MS = 5 * 60 * 1000 # 5 minutes 14 19 15 20 16 21 def get_idle_time_ms() -> int: ··· 27 32 >>> idle_ms = get_idle_time_ms() 28 33 >>> print(f"User idle for {idle_ms / 1000:.1f} seconds") 29 34 """ 30 - # TODO: Implement using PyObjC 31 - # from Quartz import CGEventSourceSecondsSinceLastEventType, kCGAnyInputEventType 32 - # seconds = CGEventSourceSecondsSinceLastEventType(1, kCGAnyInputEventType) 33 - # return int(seconds * 1000) 34 - logger.warning("get_idle_time_ms not yet implemented") 35 - return 0 35 + try: 36 + # kCGEventSourceStateHIDSystemState = 1 (hardware input events) 37 + seconds = CGEventSourceSecondsSinceLastEventType(1, kCGAnyInputEventType) 38 + return int(seconds * 1000) 39 + except Exception as e: 40 + logger.warning(f"Failed to get idle time: {e}") 41 + return 0 36 42 37 43 38 44 def is_screen_locked() -> bool: 39 45 """ 40 46 Check if the screen is currently locked. 41 47 42 - Queries the macOS session state to determine if the screen lock is active. 48 + Queries the macOS session state via CGSessionCopyCurrentDictionary. 49 + When the screen is locked, kCGSSessionOnConsoleKey becomes False. 43 50 44 51 Returns: 45 52 True if screen is locked, False otherwise ··· 48 55 >>> if is_screen_locked(): 49 56 ... print("Screen is locked, skipping capture") 50 57 """ 51 - # TODO: Implement using PyObjC or subprocess 52 - # Options: 53 - # 1. Check CGSessionCopyCurrentDictionary for kCGSSessionOnConsoleKey 54 - # 2. Query via `ioreg -c IOHIDSystem` 55 - # 3. Use Quartz APIs to detect locked state 56 - logger.warning("is_screen_locked not yet implemented") 57 - return False 58 + try: 59 + session_dict = CGSessionCopyCurrentDictionary() 60 + if session_dict is None: 61 + logger.warning("CGSessionCopyCurrentDictionary returned None") 62 + return False 63 + 64 + # kCGSSessionOnConsoleKey is True when user is on console (not locked) 65 + # When screen is locked, this becomes False 66 + on_console = session_dict.get("kCGSSessionOnConsoleKey", True) 67 + return not on_console 68 + except Exception as e: 69 + logger.warning(f"Failed to check screen lock status: {e}") 70 + return False 58 71 59 72 60 73 def is_power_save_active() -> bool: 61 74 """ 62 75 Check if display power save mode is active (screen blanked/sleep). 63 76 64 - Detects if displays are in sleep mode or powered off, similar to GNOME's 65 - DisplayConfig PowerSaveMode check. 77 + Uses CGDisplayIsAsleep to detect if the main display is sleeping, 78 + similar to GNOME's DisplayConfig PowerSaveMode check. 66 79 67 80 Returns: 68 81 True if power save is active (displays off), False otherwise ··· 71 84 >>> if is_power_save_active(): 72 85 ... print("Displays are sleeping") 73 86 """ 74 - # TODO: Implement display sleep detection 75 - # Options: 76 - # 1. IOKit display state query 77 - # 2. NSScreen APIs to check if displays are active 78 - # 3. subprocess call to system_profiler or pmset 79 - logger.warning("is_power_save_active not yet implemented") 80 - return False 87 + try: 88 + main_display = CGMainDisplayID() 89 + is_asleep = CGDisplayIsAsleep(main_display) 90 + return bool(is_asleep) 91 + except Exception as e: 92 + logger.warning(f"Failed to check display sleep status: {e}") 93 + return False 81 94 82 95 83 - def get_monitor_geometries() -> list[dict]: 96 + def is_output_muted() -> bool: 84 97 """ 85 - Get structured monitor information using NSScreen. 98 + Check if the system audio output is muted. 86 99 87 - Returns monitor geometry in the same format as GNOME's get_monitor_geometries() 88 - to enable downstream compatibility. 100 + Uses osascript to query macOS volume settings, similar to how GNOME 101 + uses pactl for PulseAudio mute status. 89 102 90 103 Returns: 91 - List of dicts with format: 92 - [{"id": "display-id", "box": [x1, y1, x2, y2], "position": "center|left|right|..."}, ...] 93 - where box contains [left, top, right, bottom] coordinates 104 + True if muted, False otherwise (including on error). 94 105 95 106 Example: 96 - >>> monitors = get_monitor_geometries() 97 - >>> for mon in monitors: 98 - ... print(f"{mon['id']}: {mon['position']} at {mon['box']}") 99 - display-1: center at [0, 0, 1920, 1080] 100 - display-2: right at [1920, 0, 3840, 1080] 101 - 102 - Notes: 103 - - Coordinates are in screen space (origin may be top-left or bottom-left) 104 - - Position is computed relative to union bounding box midlines 105 - - Format matches GNOME output for compatibility with existing analysis tools 106 - """ 107 - # TODO: Implement using PyObjC NSScreen 108 - # from Cocoa import NSScreen 109 - # from observe.utils import assign_monitor_positions 110 - # 111 - # Get all screens: NSScreen.screens() 112 - # For each screen: 113 - # - Get frame: screen.frame() 114 - # - Get device description for ID: screen.deviceDescription() 115 - # - Extract NSDeviceResolution, NSScreenNumber, etc. 116 - # - Build dict with "id" and "box" keys 117 - # 118 - # Use assign_monitor_positions() to add position labels 119 - # Return list matching GNOME format 120 - logger.warning("get_monitor_geometries not yet implemented") 121 - return [] 122 - 123 - 124 - def get_monitor_metadata_string() -> str: 125 - """ 126 - Format monitor geometries as a metadata string for video title. 127 - 128 - Converts monitor geometry data into the format used in GNOME screencasts: 129 - "0:center,0,0,1920,1080 1:right,1920,0,3840,1080" 130 - 131 - This string is stored in the video file's title metadata to enable per-monitor 132 - analysis in downstream tools. 133 - 134 - Returns: 135 - Formatted metadata string, or empty string if no monitors 136 - 137 - Example: 138 - >>> metadata = get_monitor_metadata_string() 139 - >>> print(metadata) 140 - "0:center,0,0,1920,1080 1:right,1920,0,3840,1080" 107 + >>> if is_output_muted(): 108 + ... print("Audio is muted") 141 109 """ 142 - geometries = get_monitor_geometries() 143 - if not geometries: 144 - return "" 110 + try: 111 + result = subprocess.run( 112 + ["osascript", "-e", "output muted of (get volume settings)"], 113 + capture_output=True, 114 + text=True, 115 + timeout=5, 116 + ) 145 117 146 - parts = [] 147 - for i, geom in enumerate(geometries): 148 - x1, y1, x2, y2 = geom["box"] 149 - position = geom["position"] 150 - parts.append(f"{i}:{position},{x1},{y1},{x2},{y2}") 118 + if result.returncode != 0: 119 + logger.warning( 120 + f"osascript failed (rc={result.returncode}): {result.stderr}" 121 + ) 122 + return False 151 123 152 - return " ".join(parts) 124 + return result.stdout.strip().lower() == "true" 125 + except subprocess.TimeoutExpired: 126 + logger.warning("osascript timed out checking mute status") 127 + return False 128 + except FileNotFoundError: 129 + logger.warning("osascript not found") 130 + return False 131 + except Exception as e: 132 + logger.warning(f"Error checking output mute status: {e}") 133 + return False

+426 -81

observe/macos/observer.py

··· 11 11 import datetime 12 12 import logging 13 13 import os 14 + import shutil 14 15 import signal 15 16 import sys 16 17 import time 17 - from pathlib import Path 18 + 19 + import av 20 + import numpy as np 18 21 19 22 from observe.macos.activity import ( 20 23 get_idle_time_ms, 21 - get_monitor_metadata_string, 24 + is_output_muted, 25 + is_power_save_active, 22 26 is_screen_locked, 23 27 ) 24 - from observe.macos.screencapture import ScreenCaptureKitManager 28 + from observe.macos.screencapture import AudioInfo, DisplayInfo, ScreenCaptureKitManager 25 29 from think.callosum import CallosumConnection 26 30 from think.utils import day_path, setup_cli 27 31 ··· 30 34 # Constants 31 35 IDLE_THRESHOLD_MS = 5 * 60 * 1000 # 5 minutes 32 36 CHUNK_DURATION = 5 # seconds 37 + RMS_THRESHOLD = 0.01 38 + MIN_HITS_FOR_SAVE = 3 39 + SAMPLE_RATE = 48000 # Standard audio sample rate 33 40 34 41 35 42 class MacOSObserver: 36 43 """macOS audio and screencast observer using ScreenCaptureKit.""" 37 44 38 - def __init__(self, interval: int = 300): 45 + def __init__(self, interval: int = 300, sck_cli_path: str = "sck-cli"): 39 46 """ 40 47 Initialize the macOS observer. 41 48 42 49 Args: 43 50 interval: Window duration in seconds (default: 300 = 5 minutes) 51 + sck_cli_path: Path to sck-cli executable 44 52 """ 45 53 self.interval = interval 46 - self.screencapture = ScreenCaptureKitManager() 54 + self.screencapture = ScreenCaptureKitManager(sck_cli_path=sck_cli_path) 47 55 self.running = True 48 56 self.callosum: CallosumConnection | None = None 49 57 ··· 51 59 self.start_at = time.time() # Wall-clock for filenames 52 60 self.start_at_mono = time.monotonic() # Monotonic for elapsed calculations 53 61 self.capture_running = False 54 - self.current_output_base = None # Base path for current capture 55 - self.pending_finalization = ( 56 - None # Tuple of (temp_base, final_video, final_audio) 57 - ) 58 - self.last_video_size = 0 # Track file size for health checks 62 + 63 + # Multi-display tracking (similar to GNOME observer) 64 + self.current_displays: list[DisplayInfo] = [] 65 + self.current_audio: AudioInfo | None = None 66 + self.pending_finalization: list[tuple[str, str]] | None = None 67 + self.last_video_sizes: dict[str, int] = {} 59 68 60 69 # Activity status cache (updated each loop) 61 70 self.cached_is_active = False 62 71 self.cached_idle_time_ms = 0 63 72 self.cached_screen_locked = False 73 + self.cached_is_muted = False 74 + self.cached_power_save = False 75 + 76 + # Mute state at segment start 77 + self.segment_is_muted = False 78 + 79 + # Health tracking 80 + self.files_growing = False 64 81 65 82 async def setup(self): 66 83 """Initialize ScreenCaptureKit and Callosum connection.""" 67 - # TODO: Implement setup 68 - # 1. Verify sck-cli is available in PATH 69 - # 2. Start Callosum connection for status events 70 - # 3. Log initialization success 71 - logger.warning("setup() not yet implemented") 72 - return False 84 + # Verify sck-cli is available 85 + sck_path = shutil.which(self.screencapture.sck_cli_path) 86 + if not sck_path: 87 + logger.error(f"sck-cli not found: {self.screencapture.sck_cli_path}") 88 + return False 89 + logger.info(f"Found sck-cli at: {sck_path}") 73 90 74 - async def check_activity_status(self) -> bool: 91 + # Start Callosum connection for status events 92 + self.callosum = CallosumConnection() 93 + self.callosum.start() 94 + logger.info("Callosum connection started") 95 + 96 + return True 97 + 98 + def check_activity_status(self) -> bool: 75 99 """ 76 100 Check system activity status and cache values. 77 101 78 102 Returns: 79 103 True if user is active (not idle and screen unlocked) 80 104 """ 81 - # TODO: Implement activity checking 82 - # 1. Call get_idle_time_ms() 83 - # 2. Call is_screen_locked() 84 - # 3. Cache values for status events 85 - # 4. Determine if active (not idle and not locked) 86 - # 5. Return activity status 87 - logger.warning("check_activity_status() not yet implemented") 88 - return False 105 + idle_time = get_idle_time_ms() 106 + screen_locked = is_screen_locked() 107 + power_save = is_power_save_active() 108 + output_muted = is_output_muted() 109 + 110 + # Cache values for status events 111 + self.cached_idle_time_ms = idle_time 112 + self.cached_screen_locked = screen_locked 113 + self.cached_power_save = power_save 114 + self.cached_is_muted = output_muted 115 + 116 + is_idle = (idle_time > IDLE_THRESHOLD_MS) or screen_locked or power_save 117 + is_active = not is_idle 118 + 119 + # Cache result 120 + self.cached_is_active = is_active 121 + 122 + return is_active 89 123 90 124 def get_timestamp_parts(self, timestamp: float = None) -> tuple[str, str]: 91 125 """ ··· 104 138 time_part = dt.strftime("%H%M%S") 105 139 return date_part, time_part 106 140 107 - async def handle_boundary(self, is_active: bool): 141 + def _check_audio_threshold(self, audio_path: str) -> bool: 142 + """ 143 + Check if audio file has enough voice activity to save. 144 + 145 + Decodes the m4a file and applies the same 3-chunk RMS threshold 146 + logic as GNOME observer uses for real-time audio. 147 + 148 + Args: 149 + audio_path: Path to the m4a audio file 150 + 151 + Returns: 152 + True if audio should be saved (enough voice activity), False otherwise 153 + """ 154 + if not os.path.exists(audio_path): 155 + logger.warning(f"Audio file not found for threshold check: {audio_path}") 156 + return False 157 + 158 + try: 159 + container = av.open(audio_path) 160 + audio_streams = list(container.streams.audio) 161 + 162 + if not audio_streams: 163 + container.close() 164 + logger.warning(f"No audio streams in {audio_path}") 165 + return False 166 + 167 + stream = audio_streams[0] 168 + sample_rate = stream.rate or SAMPLE_RATE 169 + 170 + # Decode audio and collect samples 171 + samples = [] 172 + for frame in container.decode(stream): 173 + arr = frame.to_ndarray() 174 + # Convert to mono if stereo (average channels) 175 + if arr.ndim > 1: 176 + arr = arr.mean(axis=0) 177 + samples.append(arr.flatten()) 178 + 179 + container.close() 180 + 181 + if not samples: 182 + logger.warning(f"No audio samples decoded from {audio_path}") 183 + return False 184 + 185 + # Concatenate all samples 186 + all_samples = np.concatenate(samples) 187 + 188 + # Split into CHUNK_DURATION (5 second) chunks and count threshold hits 189 + chunk_samples = int(sample_rate * CHUNK_DURATION) 190 + threshold_hits = 0 191 + 192 + for i in range(0, len(all_samples), chunk_samples): 193 + chunk = all_samples[i : i + chunk_samples] 194 + if len(chunk) == 0: 195 + continue 196 + 197 + # Compute RMS for this chunk 198 + rms = float(np.sqrt(np.mean(chunk**2))) 199 + if rms > RMS_THRESHOLD: 200 + threshold_hits += 1 201 + 202 + logger.debug( 203 + f"Audio threshold check: {threshold_hits}/{MIN_HITS_FOR_SAVE} hits" 204 + ) 205 + return threshold_hits >= MIN_HITS_FOR_SAVE 206 + 207 + except Exception as e: 208 + logger.warning(f"Error checking audio threshold for {audio_path}: {e}") 209 + # On error, keep the file (safer default) 210 + return True 211 + 212 + def handle_boundary(self, is_active: bool): 108 213 """ 109 214 Handle window boundary rollover. 110 215 111 216 Args: 112 217 is_active: Whether system is currently active 113 218 """ 114 - # TODO: Implement boundary handling 115 - # 1. Get timestamp parts and calculate duration 116 - # 2. Stop current capture if running 117 - # 3. Queue files for finalization (temp paths -> final paths with duration) 118 - # 4. Reset timing for new window 119 - # 5. Start new capture if active and screen not locked 120 - # 6. Emit Callosum observing event with saved files 121 - logger.warning("handle_boundary() not yet implemented") 219 + # Get timestamp parts for this window and calculate duration 220 + date_part, time_part = self.get_timestamp_parts(self.start_at) 221 + duration = int(time.time() - self.start_at) 222 + day_dir = day_path(date_part) 223 + 224 + saved_files: list[str] = [] 225 + finalizations: list[tuple[str, str]] = [] 226 + 227 + if self.capture_running: 228 + logger.info("Stopping previous capture") 229 + self.screencapture.stop() 230 + self.capture_running = False 231 + 232 + # Build finalization list for video files 233 + for display in self.current_displays: 234 + if os.path.exists(display.temp_path): 235 + final_name = display.final_name(time_part, duration) 236 + final_path = str(day_dir / final_name) 237 + finalizations.append((display.temp_path, final_path)) 238 + saved_files.append(final_name) 239 + 240 + # Check audio threshold before including in finalization 241 + if self.current_audio and os.path.exists(self.current_audio.temp_path): 242 + if self._check_audio_threshold(self.current_audio.temp_path): 243 + final_name = self.current_audio.final_name(time_part, duration) 244 + final_path = str(day_dir / final_name) 245 + finalizations.append((self.current_audio.temp_path, final_path)) 246 + saved_files.append(final_name) 247 + logger.info(f"Audio passed threshold check, saving: {final_name}") 248 + else: 249 + # Delete the temp audio file 250 + try: 251 + os.remove(self.current_audio.temp_path) 252 + logger.info("Audio below threshold, discarded") 253 + except OSError as e: 254 + logger.warning(f"Failed to remove audio file: {e}") 255 + 256 + # Clear state 257 + self.current_displays = [] 258 + self.current_audio = None 259 + self.last_video_sizes = {} 260 + 261 + if finalizations: 262 + self.pending_finalization = finalizations 263 + 264 + # Reset timing for new window 265 + self.start_at = time.time() 266 + self.start_at_mono = time.monotonic() 267 + 268 + # Update segment mute state 269 + self.segment_is_muted = self.cached_is_muted 270 + 271 + # Start new capture if active and screen not locked 272 + if is_active and not self.cached_screen_locked: 273 + self.initialize_capture() 274 + 275 + # Emit observing event with saved files 276 + if saved_files and self.callosum: 277 + segment = f"{time_part}_{duration}" 278 + self.callosum.emit( 279 + "observe", 280 + "observing", 281 + segment=segment, 282 + files=saved_files, 283 + ) 284 + logger.info(f"Segment observing: {segment} ({len(saved_files)} files)") 122 285 123 - async def initialize_capture(self) -> bool: 286 + def initialize_capture(self) -> bool: 124 287 """ 125 288 Start a new screencast and audio recording. 126 289 127 290 Returns: 128 291 True if capture started successfully, False otherwise 129 292 """ 130 - # TODO: Implement capture initialization 131 - # 1. Get timestamp for filename 132 - # 2. Build temp output base path (e.g., .HHMMSS) 133 - # 3. Start sck-cli via ScreenCaptureKitManager 134 - # 4. Update state tracking variables 135 - # 5. Log capture start 136 - logger.warning("initialize_capture() not yet implemented") 137 - return False 293 + date_part, time_part = self.get_timestamp_parts(self.start_at) 294 + day_dir = day_path(date_part) 295 + 296 + # Ensure day directory exists 297 + day_dir.mkdir(parents=True, exist_ok=True) 298 + 299 + # Build temp output base (hidden file) 300 + output_base = day_dir / f".{time_part}" 301 + 302 + try: 303 + displays, audio = self.screencapture.start( 304 + output_base, self.interval, frame_rate=1.0 305 + ) 306 + except RuntimeError as e: 307 + logger.error(f"Failed to start capture: {e}") 308 + return False 309 + 310 + self.current_displays = displays 311 + self.current_audio = audio 312 + self.capture_running = True 313 + self.last_video_sizes = {d.temp_path: 0 for d in displays} 314 + 315 + logger.info(f"Started capture with {len(displays)} display(s)") 316 + for display in displays: 317 + logger.info( 318 + f" Display {display.display_id}: {display.position} -> {display.temp_path}" 319 + ) 320 + if audio: 321 + logger.info(f" Audio: {audio.temp_path}") 322 + 323 + return True 138 324 139 325 def emit_status(self): 140 326 """Emit observe.status event with current state.""" 141 - # TODO: Implement status emission 142 - # 1. Build capture info dict (recording status, file path, elapsed time) 143 - # 2. Build activity info dict (active, idle_time_ms, screen_locked) 144 - # 3. Emit via Callosum 145 - logger.warning("emit_status() not yet implemented") 327 + if not self.callosum: 328 + return 329 + 330 + journal_path = os.getenv("JOURNAL_PATH", "") 331 + 332 + # Build capture info 333 + if self.capture_running and self.current_displays: 334 + elapsed = int(time.monotonic() - self.start_at_mono) 335 + displays_info = [] 336 + for display in self.current_displays: 337 + try: 338 + rel_file = ( 339 + os.path.relpath(display.temp_path, journal_path) 340 + if journal_path 341 + else display.temp_path 342 + ) 343 + except ValueError: 344 + rel_file = display.temp_path 345 + 346 + displays_info.append( 347 + { 348 + "position": display.position, 349 + "display_id": display.display_id, 350 + "file": rel_file, 351 + } 352 + ) 353 + 354 + capture_info = { 355 + "recording": True, 356 + "displays": displays_info, 357 + "window_elapsed_seconds": elapsed, 358 + "files_growing": self.files_growing, 359 + } 360 + else: 361 + capture_info = {"recording": False, "files_growing": False} 362 + 363 + # Activity info 364 + activity_info = { 365 + "active": self.cached_is_active, 366 + "idle_time_ms": self.cached_idle_time_ms, 367 + "screen_locked": self.cached_screen_locked, 368 + "power_save": self.cached_power_save, 369 + "sink_muted": self.cached_is_muted, 370 + } 371 + 372 + self.callosum.emit( 373 + "observe", 374 + "status", 375 + capture=capture_info, 376 + activity=activity_info, 377 + ) 146 378 147 - async def finalize_capture( 148 - self, temp_base: Path, final_video: Path, final_audio: Path 149 - ): 379 + def finalize_screencast(self, temp_path: str, final_path: str): 150 380 """ 151 - Add monitor metadata to video and rename files from temp to final paths. 381 + Rename capture file from temp to final path. 152 382 153 383 Args: 154 - temp_base: Temporary base path (e.g., /path/.HHMMSS) 155 - final_video: Final video path (e.g., /path/HHMMSS_300_screen.mov) 156 - final_audio: Final audio path (e.g., /path/HHMMSS_300_audio.m4a) 384 + temp_path: Temporary file path 385 + final_path: Final destination path 157 386 """ 158 - # TODO: Implement finalization 159 - # 1. Check if temp files exist 160 - # 2. Get monitor metadata string 161 - # 3. Call screencapture.finalize() to add metadata and rename 162 - # 4. Log success/failure 163 - logger.warning("finalize_capture() not yet implemented") 387 + if not os.path.exists(temp_path): 388 + logger.warning(f"Capture file not found: {temp_path}") 389 + return 390 + 391 + try: 392 + os.replace(temp_path, final_path) 393 + logger.info(f"Finalized: {final_path}") 394 + except OSError as e: 395 + logger.error(f"Failed to rename {temp_path} to {final_path}: {e}") 164 396 165 397 async def main_loop(self): 166 398 """Run the main observer loop.""" 167 - # TODO: Implement main loop 168 - # 1. Check activity and start capture if active 169 - # 2. Loop with CHUNK_DURATION sleep intervals: 170 - # a. Process pending finalization if queued 171 - # b. Check activity status 172 - # c. Detect activation edge (idle -> active transition) 173 - # d. Check for window boundary or activation edge 174 - # e. Handle boundary if needed 175 - # f. Touch health files (see, hear) 176 - # g. Emit status event 177 - # 3. Call shutdown on exit 178 - logger.warning("main_loop() not yet implemented") 399 + logger.info(f"Starting observer loop (interval={self.interval}s)") 400 + 401 + # Check initial activity and start capture if active 402 + is_active = self.check_activity_status() 403 + self.segment_is_muted = self.cached_is_muted 404 + 405 + if is_active and not self.cached_screen_locked: 406 + if not self.initialize_capture(): 407 + logger.error("Failed to start initial capture") 408 + self.running = False 409 + return 410 + 411 + while self.running: 412 + # Sleep for chunk duration 413 + await asyncio.sleep(CHUNK_DURATION) 414 + 415 + # Process pending finalizations 416 + if self.pending_finalization: 417 + for temp_path, final_path in self.pending_finalization: 418 + if os.path.exists(temp_path): 419 + self.finalize_screencast(temp_path, final_path) 420 + else: 421 + logger.warning(f"Pending file not found: {temp_path}") 422 + self.pending_finalization = None 423 + 424 + # Check activity status 425 + is_active = self.check_activity_status() 426 + 427 + # Check if sck-cli process died unexpectedly 428 + if self.capture_running and not self.screencapture.is_running(): 429 + logger.warning("Capture process died, handling boundary") 430 + self.handle_boundary(is_active) 431 + continue 432 + 433 + # Detect activation edge (idle -> active transition) 434 + activation_edge = is_active and not self.capture_running 435 + 436 + # Detect mute state transition 437 + mute_transition = self.cached_is_muted != self.segment_is_muted 438 + if mute_transition: 439 + logger.info( 440 + f"Mute state changed: " 441 + f"{'muted' if self.segment_is_muted else 'unmuted'} -> " 442 + f"{'muted' if self.cached_is_muted else 'unmuted'}" 443 + ) 444 + 445 + # Check for window boundary (use monotonic to avoid DST/clock jumps) 446 + now_mono = time.monotonic() 447 + elapsed = now_mono - self.start_at_mono 448 + is_boundary = ( 449 + (elapsed >= self.interval) or activation_edge or mute_transition 450 + ) 451 + 452 + if is_boundary: 453 + logger.info( 454 + f"Boundary: elapsed={elapsed:.1f}s edge={activation_edge} " 455 + f"mute_change={mute_transition}" 456 + ) 457 + self.handle_boundary(is_active) 458 + 459 + # Check if capture files are actively growing (health indicator) 460 + if self.capture_running and self.current_displays: 461 + any_growing = False 462 + for display in self.current_displays: 463 + if os.path.exists(display.temp_path): 464 + current_size = os.path.getsize(display.temp_path) 465 + last_size = self.last_video_sizes.get(display.temp_path, 0) 466 + if current_size > last_size: 467 + any_growing = True 468 + self.last_video_sizes[display.temp_path] = current_size 469 + self.files_growing = any_growing 470 + else: 471 + self.files_growing = False 472 + 473 + # Emit status event 474 + self.emit_status() 475 + 476 + # Cleanup on exit 477 + logger.info("Observer loop stopped, cleaning up...") 478 + await self.shutdown() 179 479 180 480 async def shutdown(self): 181 481 """Clean shutdown of observer.""" 182 - # TODO: Implement shutdown 183 - # 1. Stop current capture if running 184 - # 2. Wait for files to be written 185 - # 3. Finalize pending captures 186 - # 4. Stop Callosum connection 187 - # 5. Log shutdown complete 188 - logger.warning("shutdown() not yet implemented") 482 + # Stop capture if running 483 + if self.capture_running: 484 + logger.info("Stopping capture for shutdown") 485 + self.screencapture.stop() 486 + 487 + # Brief delay for files to be written 488 + await asyncio.sleep(0.5) 489 + 490 + # Get timestamp parts for finalization 491 + date_part, time_part = self.get_timestamp_parts(self.start_at) 492 + duration = int(time.time() - self.start_at) 493 + day_dir = day_path(date_part) 494 + 495 + # Finalize video files 496 + for display in self.current_displays: 497 + if os.path.exists(display.temp_path): 498 + final_name = display.final_name(time_part, duration) 499 + final_path = str(day_dir / final_name) 500 + self.finalize_screencast(display.temp_path, final_path) 501 + 502 + # Check and finalize audio if threshold met 503 + if self.current_audio and os.path.exists(self.current_audio.temp_path): 504 + if self._check_audio_threshold(self.current_audio.temp_path): 505 + final_name = self.current_audio.final_name(time_part, duration) 506 + final_path = str(day_dir / final_name) 507 + self.finalize_screencast(self.current_audio.temp_path, final_path) 508 + else: 509 + try: 510 + os.remove(self.current_audio.temp_path) 511 + logger.info("Final audio below threshold, discarded") 512 + except OSError: 513 + pass 514 + 515 + self.capture_running = False 516 + 517 + # Process any remaining pending finalizations 518 + if self.pending_finalization: 519 + await asyncio.sleep(0.5) 520 + for temp_path, final_path in self.pending_finalization: 521 + if os.path.exists(temp_path): 522 + self.finalize_screencast(temp_path, final_path) 523 + self.pending_finalization = None 524 + 525 + # Stop Callosum connection 526 + if self.callosum: 527 + self.callosum.stop() 528 + logger.info("Callosum connection stopped") 529 + 530 + logger.info("Shutdown complete") 189 531 190 532 191 533 async def async_main(args): 192 534 """Async entry point.""" 193 - observer = MacOSObserver(interval=args.interval) 535 + observer = MacOSObserver( 536 + interval=args.interval, 537 + sck_cli_path=args.sck_cli_path, 538 + ) 194 539 195 540 # Setup signal handlers 196 541 loop = asyncio.get_running_loop()

+184 -102

observe/macos/screencapture.py

··· 1 1 """ScreenCaptureKit integration via sck-cli subprocess. 2 2 3 3 This module manages the sck-cli subprocess lifecycle for video and audio capture 4 - on macOS using ScreenCaptureKit. 4 + on macOS using ScreenCaptureKit. sck-cli captures all displays simultaneously 5 + and outputs JSONL metadata to stdout with display geometry information. 5 6 """ 6 7 8 + import json 7 9 import logging 8 - import os 10 + import signal 9 11 import subprocess 12 + from dataclasses import dataclass 10 13 from pathlib import Path 11 14 from typing import Optional 15 + 16 + from observe.utils import assign_monitor_positions 12 17 13 18 logger = logging.getLogger(__name__) 14 19 15 20 21 + @dataclass 22 + class DisplayInfo: 23 + """Information about a single display's recording.""" 24 + 25 + display_id: int 26 + position: str 27 + x: int 28 + y: int 29 + width: int 30 + height: int 31 + temp_path: str 32 + 33 + def final_name(self, time_part: str, duration: int) -> str: 34 + """Generate the final filename for this display's video.""" 35 + return f"{time_part}_{duration}_{self.position}_{self.display_id}_screen.mov" 36 + 37 + 38 + @dataclass 39 + class AudioInfo: 40 + """Information about the audio recording.""" 41 + 42 + temp_path: str 43 + tracks: list[str] 44 + 45 + def final_name(self, time_part: str, duration: int) -> str: 46 + """Generate the final filename for audio.""" 47 + return f"{time_part}_{duration}_audio.m4a" 48 + 49 + 16 50 class ScreenCaptureKitManager: 17 51 """ 18 52 Manages sck-cli subprocess for synchronized video and audio capture. 19 53 20 54 Wraps the sck-cli tool to provide lifecycle management, handles process 21 - monitoring, and manages output file finalization with metadata. 55 + monitoring, parses JSONL output for display geometry, and manages output 56 + file finalization. 22 57 """ 23 58 24 59 def __init__(self, sck_cli_path: str = "sck-cli"): ··· 30 65 """ 31 66 self.sck_cli_path = sck_cli_path 32 67 self.process: Optional[subprocess.Popen] = None 33 - self.current_output_base: Optional[str] = None 68 + self.displays: list[DisplayInfo] = [] 69 + self.audio: Optional[AudioInfo] = None 70 + self.output_base: Optional[Path] = None 34 71 35 72 def start( 36 73 self, 37 74 output_base: Path, 38 75 duration: int, 39 76 frame_rate: float = 1.0, 40 - ) -> bool: 77 + ) -> tuple[list[DisplayInfo], Optional[AudioInfo]]: 41 78 """ 42 - Start video and audio capture to temporary files. 79 + Start video and audio capture. 43 80 44 81 Launches sck-cli as a subprocess with the specified parameters. 45 - Files are written to output_base.mov and output_base.m4a. 82 + Parses JSONL output from stdout to get display geometry information. 83 + Files are written to output_base_<displayID>.mov and output_base.m4a. 46 84 47 85 Args: 48 86 output_base: Base path for output files (without extension) ··· 50 88 frame_rate: Frame rate in Hz (default: 1.0) 51 89 52 90 Returns: 53 - True if subprocess started successfully, False otherwise 91 + Tuple of (list of DisplayInfo, AudioInfo or None) 92 + 93 + Raises: 94 + RuntimeError: If sck-cli fails to start or returns no displays 54 95 55 96 Example: 56 97 >>> manager = ScreenCaptureKitManager() 57 98 >>> day_dir = Path("journal/20250101") 58 99 >>> output_base = day_dir / ".120000" # Hidden temp file 59 - >>> manager.start(output_base, duration=300, frame_rate=1.0) 60 - True 100 + >>> displays, audio = manager.start(output_base, duration=300) 61 101 """ 62 - # TODO: Implement subprocess launch 63 - # Command: sck-cli <output_base> -r <frame_rate> -l <duration> 64 - # Store process handle in self.process 65 - # Store output_base in self.current_output_base 66 - # Check if sck-cli is available in PATH 67 - # Handle launch errors and log appropriately 68 - logger.warning("start() not yet implemented") 69 - return False 102 + self.output_base = output_base 70 103 71 - def stop(self) -> None: 72 - """ 73 - Stop the running capture gracefully. 104 + # Build command 105 + cmd = [ 106 + self.sck_cli_path, 107 + str(output_base), 108 + "-r", 109 + str(frame_rate), 110 + "-l", 111 + str(duration), 112 + ] 74 113 75 - Sends SIGTERM to the sck-cli process and waits for it to finish writing 76 - files properly. This ensures video and audio files are finalized correctly. 114 + logger.info(f"Starting sck-cli: {' '.join(cmd)}") 77 115 78 - Example: 79 - >>> manager.stop() 80 - # sck-cli receives SIGTERM and finishes writing files 81 - """ 82 - # TODO: Implement graceful shutdown 83 - # 1. Check if self.process exists and is running 84 - # 2. Send SIGTERM signal 85 - # 3. Wait with timeout (e.g., 5 seconds) for process to complete 86 - # 4. If timeout, send SIGKILL as fallback 87 - # 5. Clean up process handle 88 - logger.warning("stop() not yet implemented") 116 + try: 117 + self.process = subprocess.Popen( 118 + cmd, 119 + stdout=subprocess.PIPE, 120 + stderr=subprocess.PIPE, 121 + text=True, 122 + ) 123 + except FileNotFoundError: 124 + raise RuntimeError(f"sck-cli not found at: {self.sck_cli_path}") 125 + except Exception as e: 126 + raise RuntimeError(f"Failed to start sck-cli: {e}") 89 127 90 - def is_running(self) -> bool: 91 - """ 92 - Check if the capture subprocess is currently running. 128 + # Read JSONL metadata from stdout (sck-cli outputs this immediately) 129 + # Each line is either a display info or audio info 130 + displays_raw = [] 131 + audio_info = None 93 132 94 - Returns: 95 - True if subprocess is active, False otherwise 133 + # Read lines until we get all metadata (sck-cli outputs then starts capture) 134 + # We need to read non-blocking since the process keeps running 135 + try: 136 + for line in self.process.stdout: 137 + line = line.strip() 138 + if not line: 139 + continue 140 + try: 141 + data = json.loads(line) 142 + if data.get("type") == "display": 143 + displays_raw.append(data) 144 + elif data.get("type") == "audio": 145 + audio_info = data 146 + except json.JSONDecodeError: 147 + # Not JSON, might be a log message - ignore 148 + pass 96 149 97 - Example: 98 - >>> if manager.is_running(): 99 - ... print("Capture in progress") 100 - """ 101 - # TODO: Implement process status check 102 - # Check if self.process is not None and self.process.poll() is None 103 - return False 150 + # sck-cli outputs all metadata before starting capture 151 + # Once we have both displays and audio (or displays only if no audio) 152 + # we can stop reading. But we also need to not block forever. 153 + # Actually, sck-cli flushes stdout after metadata, so readline 154 + # will return empty when no more data. But process is still running. 155 + # We break after getting audio info or when stdout blocks. 156 + if audio_info is not None: 157 + break 158 + except Exception as e: 159 + logger.warning(f"Error reading sck-cli stdout: {e}") 104 160 105 - def finalize( 106 - self, 107 - temp_base: Path, 108 - final_video_path: Path, 109 - final_audio_path: Path, 110 - monitor_metadata: str, 111 - ) -> tuple[bool, bool]: 112 - """ 113 - Finalize capture files: add metadata and rename to final paths. 161 + if not displays_raw: 162 + self.stop() 163 + raise RuntimeError("sck-cli returned no display information") 114 164 115 - Takes temporary output files from sck-cli, adds monitor geometry metadata 116 - to the video file, and renames both files to their final destinations with 117 - duration in the filename. 165 + # Convert raw display data to monitor format for position assignment 166 + monitors = [] 167 + for d in displays_raw: 168 + x = int(d.get("x", 0)) 169 + y = int(d.get("y", 0)) 170 + width = int(d.get("width", 0)) 171 + height = int(d.get("height", 0)) 172 + monitors.append( 173 + { 174 + "id": str(d["displayID"]), 175 + "box": [x, y, x + width, y + height], 176 + "_raw": d, 177 + } 178 + ) 118 179 119 - Args: 120 - temp_base: Base path of temporary files (without extension) 121 - final_video_path: Final path for video file (HHMMSS_DURATION_screen.mov) 122 - final_audio_path: Final path for audio file (HHMMSS_DURATION_audio.m4a) 123 - monitor_metadata: Monitor geometry string to embed in video metadata 180 + # Assign position labels based on geometry 181 + monitors = assign_monitor_positions(monitors) 124 182 125 - Returns: 126 - Tuple of (video_success, audio_success) booleans 183 + # Build DisplayInfo objects 184 + self.displays = [] 185 + for mon in monitors: 186 + raw = mon["_raw"] 187 + self.displays.append( 188 + DisplayInfo( 189 + display_id=raw["displayID"], 190 + position=mon["position"], 191 + x=mon["box"][0], 192 + y=mon["box"][1], 193 + width=mon["box"][2] - mon["box"][0], 194 + height=mon["box"][3] - mon["box"][1], 195 + temp_path=raw["filename"], 196 + ) 197 + ) 198 + 199 + # Build AudioInfo if present 200 + if audio_info: 201 + tracks = [t["name"] for t in audio_info.get("tracks", [])] 202 + self.audio = AudioInfo( 203 + temp_path=audio_info["filename"], 204 + tracks=tracks, 205 + ) 206 + else: 207 + self.audio = None 208 + 209 + logger.info(f"sck-cli started with {len(self.displays)} display(s)") 210 + for display in self.displays: 211 + logger.info( 212 + f" Display {display.display_id}: {display.position} " 213 + f"({display.width}x{display.height}) -> {display.temp_path}" 214 + ) 215 + if self.audio: 216 + logger.info(f" Audio: {self.audio.temp_path} ({self.audio.tracks})") 127 217 128 - Example: 129 - >>> metadata = "0:center,0,0,1920,1080" 130 - >>> manager.finalize( 131 - ... Path("journal/20250101/.120000"), 132 - ... Path("journal/20250101/120000_300_screen.mov"), 133 - ... Path("journal/20250101/120000_300_audio.m4a"), 134 - ... metadata 135 - ... ) 136 - (True, True) 218 + return self.displays, self.audio 137 219 138 - Notes: 139 - - Uses ffmpeg or similar to update video metadata 140 - - Atomically renames files to avoid partial writes 141 - - Logs errors if files are missing or operations fail 220 + def stop(self) -> None: 142 221 """ 143 - # TODO: Implement file finalization 144 - # 1. Check if temp files exist (temp_base.mov, temp_base.m4a) 145 - # 2. Add monitor metadata to video title: 146 - # - Use ffmpeg: `ffmpeg -i input.mov -metadata title="..." -c copy output.mov` 147 - # - Or use PyObjC AVFoundation APIs to modify metadata 148 - # 3. Atomically rename temp_base.mov -> final_video_path 149 - # 4. Atomically rename temp_base.m4a -> final_audio_path 150 - # 5. Return success status for each file 151 - # 6. Handle errors gracefully (missing files, permission issues, etc.) 152 - logger.warning("finalize() not yet implemented") 153 - return False, False 222 + Stop the running capture gracefully. 154 223 155 - def get_output_size(self) -> int: 224 + Sends SIGTERM to the sck-cli process and waits for it to finish writing 225 + files properly. This ensures video and audio files are finalized correctly. 156 226 """ 157 - Get the current size of the video output file. 227 + if self.process is None: 228 + return 158 229 159 - Used for health checks to verify the file is growing during capture. 230 + if self.process.poll() is None: 231 + logger.info("Stopping sck-cli...") 232 + try: 233 + self.process.send_signal(signal.SIGTERM) 234 + try: 235 + self.process.wait(timeout=5) 236 + except subprocess.TimeoutExpired: 237 + logger.warning("sck-cli did not exit cleanly, killing") 238 + self.process.kill() 239 + self.process.wait() 240 + except Exception as e: 241 + logger.warning(f"Error stopping sck-cli: {e}") 242 + 243 + self.process = None 244 + 245 + def is_running(self) -> bool: 246 + """ 247 + Check if the capture subprocess is currently running. 160 248 161 249 Returns: 162 - File size in bytes, or 0 if file doesn't exist or not capturing 163 - 164 - Example: 165 - >>> size = manager.get_output_size() 166 - >>> print(f"Captured {size / 1024 / 1024:.1f} MB so far") 250 + True if subprocess is active, False otherwise 167 251 """ 168 - # TODO: Implement file size check 169 - # Check if self.current_output_base exists 170 - # Check if .mov file exists and return its size 171 - # Return 0 if not found or not capturing 172 - return 0 252 + if self.process is None: 253 + return False 254 + return self.process.poll() is None

+37

observe/observer.py

··· 1 + """Unified observer entry point with platform detection. 2 + 3 + Detects the current platform and delegates to the appropriate 4 + platform-specific observer implementation. 5 + """ 6 + 7 + import sys 8 + 9 + 10 + def main() -> None: 11 + """Platform-aware observer entry point. 12 + 13 + Detects the current platform and calls the appropriate observer: 14 + - macOS (darwin): observe.macos.observer 15 + - Linux: observe.gnome.observer 16 + 17 + All command-line arguments are passed through to the platform-specific 18 + implementation via its main() function. 19 + """ 20 + platform = sys.platform 21 + 22 + if platform == "darwin": 23 + from observe.macos.observer import main as platform_main 24 + elif platform == "linux": 25 + from observe.gnome.observer import main as platform_main 26 + else: 27 + print( 28 + f"Error: Observer not available for platform '{platform}'", file=sys.stderr 29 + ) 30 + print("Supported platforms: macOS (darwin), Linux", file=sys.stderr) 31 + sys.exit(1) 32 + 33 + platform_main() 34 + 35 + 36 + if __name__ == "__main__": 37 + main()

+13 -7

observe/utils.py

··· 147 147 148 148 def parse_screen_filename(filename: str) -> tuple[str, str]: 149 149 """ 150 - Parse position and connector from a per-monitor screen filename. 150 + Parse position and connector/displayID from a per-monitor screen filename. 151 151 152 152 Handles both pre-move filenames (with segment prefix) and post-move filenames 153 - (in segment directory without prefix). 153 + (in segment directory without prefix). Works with both GNOME connector IDs 154 + (e.g., "DP-3") and macOS displayIDs (e.g., "1"). 154 155 155 156 Parameters 156 157 ---------- 157 158 filename : str 158 159 Filename stem (without extension), e.g.: 159 - - "143022_300_center_DP-3_screen" (pre-move, in day root) 160 - - "center_DP-3_screen" (post-move, in segment directory) 160 + - "143022_300_center_DP-3_screen" (GNOME pre-move) 161 + - "143022_300_center_1_screen" (macOS pre-move) 162 + - "center_DP-3_screen" (GNOME post-move) 163 + - "center_1_screen" (macOS post-move) 161 164 162 165 Returns 163 166 ------- 164 167 tuple[str, str] 165 - (position, connector) tuple, e.g., ("center", "DP-3") 168 + (position, connector) tuple, e.g., ("center", "DP-3") or ("center", "1") 166 169 Returns ("unknown", "unknown") if pattern doesn't match 167 170 168 171 Examples 169 172 -------- 170 173 >>> parse_screen_filename("143022_300_center_DP-3_screen") 171 174 ("center", "DP-3") 175 + >>> parse_screen_filename("143022_300_center_1_screen") 176 + ("center", "1") 172 177 >>> parse_screen_filename("center_DP-3_screen") 173 178 ("center", "DP-3") 179 + >>> parse_screen_filename("center_1_screen") 180 + ("center", "1") 174 181 >>> parse_screen_filename("143022_300_screen") 175 182 ("unknown", "unknown") 176 - >>> parse_screen_filename("screen") 177 - ("unknown", "unknown") 178 183 """ 179 184 # Pattern 1: HHMMSS_LEN_position_connector_screen (pre-move) 185 + # Connector can be alphanumeric with hyphens (GNOME: DP-3) or just numeric (macOS: 1) 180 186 match = re.match(r"^\d{6}_\d+_([a-z-]+)_([A-Za-z0-9-]+)_screen$", filename) 181 187 if match: 182 188 return match.group(1), match.group(2)

+2 -3

pyproject.toml

··· 64 64 full = [ 65 65 "soundcard", 66 66 "soundfile", 67 - "pyannote.audio", 68 - "deepfilternet", 69 - "torch", 67 + "pyannote.audio>=4,<5", 70 68 "scipy", 71 69 "opencv-python", 72 70 "PyGObject", ··· 113 111 muse-cortex = "muse.cortex:main" 114 112 think-callosum = "think.callosum:main" 115 113 think-messages = "think.messages:main" 114 + observer = "observe.observer:main" 116 115 observe-gnome = "observe.gnome.observer:main" 117 116 observe-macos = "observe.macos.observer:main" 118 117 observe-transcribe = "observe.transcribe:main"

+43 -38

tests/test_cortex_client.py

··· 2 2 3 3 import json 4 4 import os 5 + import shutil 6 + import tempfile 5 7 import threading 6 8 import time 7 9 from pathlib import Path ··· 14 16 get_agent_status, 15 17 get_agent_thread, 16 18 ) 17 - from think.callosum import CallosumServer 19 + from think.callosum import CallosumConnection, CallosumServer 18 20 from think.models import GPT_5 19 21 20 22 21 23 @pytest.fixture 22 - def callosum_server(tmp_path, monkeypatch): 23 - """Start a Callosum server for testing.""" 24 + def callosum_server(monkeypatch): 25 + """Start a Callosum server for testing. 26 + 27 + Uses a short temp path in /tmp to avoid Unix socket path length limits 28 + (~104 chars on macOS). pytest's tmp_path creates paths that are too long. 29 + """ 30 + # Create short temp dir to avoid Unix socket path length limits 31 + tmp_dir = tempfile.mkdtemp(dir="/tmp", prefix="callosum_") 32 + tmp_path = Path(tmp_dir) 33 + 24 34 monkeypatch.setenv("JOURNAL_PATH", str(tmp_path)) 25 35 (tmp_path / "agents").mkdir(parents=True, exist_ok=True) 26 36 ··· 37 47 else: 38 48 pytest.fail("Callosum server did not start in time") 39 49 40 - yield server 50 + yield tmp_path 41 51 42 52 server.stop() 43 53 server_thread.join(timeout=2) 54 + shutil.rmtree(tmp_dir, ignore_errors=True) 44 55 45 56 46 - def test_cortex_request_broadcasts_to_callosum(tmp_path, monkeypatch, callosum_server): 47 - """Test that cortex_request broadcasts request to Callosum.""" 48 - monkeypatch.setenv("JOURNAL_PATH", str(tmp_path)) 57 + @pytest.fixture 58 + def callosum_listener(callosum_server): 59 + """Provide a CallosumConnection listener that collects received messages. 49 60 50 - # Listen for broadcasts 51 - received_messages = [] 61 + Yields (messages, listener) where messages is a list that accumulates 62 + all broadcast messages received during the test. 63 + """ 64 + messages = [] 52 65 53 66 def callback(msg): 54 - received_messages.append(msg) 55 - 56 - from think.callosum import CallosumConnection 67 + messages.append(msg) 57 68 58 69 listener = CallosumConnection() 59 70 listener.start(callback=callback) 71 + time.sleep(0.1) # Allow connection to establish 72 + 73 + yield messages 74 + 75 + listener.stop() 76 + 60 77 61 - time.sleep(0.1) 78 + def test_cortex_request_broadcasts_to_callosum(callosum_listener): 79 + """Test that cortex_request broadcasts request to Callosum.""" 80 + messages = callosum_listener 62 81 63 82 # Create a request 64 83 agent_id = cortex_request( ··· 71 90 time.sleep(0.2) 72 91 73 92 # Verify broadcast was received 74 - assert len(received_messages) == 1 75 - msg = received_messages[0] 93 + assert len(messages) == 1 94 + msg = messages[0] 76 95 assert msg["tract"] == "cortex" 77 96 assert msg["event"] == "request" 78 97 assert msg["prompt"] == "Test prompt" ··· 81 100 assert msg["model"] == GPT_5 82 101 assert msg["agent_id"] == agent_id 83 102 assert "ts" in msg 84 - 85 - listener.stop() 86 103 87 104 88 - def test_cortex_request_returns_agent_id(tmp_path, monkeypatch, callosum_server): 105 + def test_cortex_request_returns_agent_id(callosum_server): 89 106 """Test that cortex_request returns agent_id string.""" 90 - monkeypatch.setenv("JOURNAL_PATH", str(tmp_path)) 107 + _ = callosum_server # Needed for side effects only 91 108 92 109 agent_id = cortex_request(prompt="Test", persona="default", backend="openai") 93 110 ··· 97 114 assert len(agent_id) == 13 # Millisecond timestamp 98 115 99 116 100 - def test_cortex_request_with_handoff(tmp_path, monkeypatch, callosum_server): 117 + def test_cortex_request_with_handoff(callosum_listener): 101 118 """Test cortex_request with handoff_from parameter.""" 102 - monkeypatch.setenv("JOURNAL_PATH", str(tmp_path)) 103 - 104 - received_messages = [] 105 - 106 - def callback(msg): 107 - received_messages.append(msg) 108 - 109 - from think.callosum import CallosumConnection 110 - 111 - listener = CallosumConnection() 112 - listener.start(callback=callback) 113 - time.sleep(0.1) 119 + messages = callosum_listener 114 120 115 - agent_id = cortex_request( 121 + cortex_request( 116 122 prompt="Continue analysis", 117 123 persona="reviewer", 118 124 backend="anthropic", ··· 121 127 122 128 time.sleep(0.2) 123 129 124 - msg = received_messages[0] 130 + msg = messages[0] 125 131 assert msg["handoff_from"] == "1234567890000" 126 132 assert msg["persona"] == "reviewer" 127 - 128 - listener.stop() 129 133 130 134 131 - def test_cortex_request_unique_agent_ids(tmp_path, monkeypatch, callosum_server): 135 + def test_cortex_request_unique_agent_ids(callosum_server): 132 136 """Test that cortex_request generates unique agent IDs.""" 133 - monkeypatch.setenv("JOURNAL_PATH", str(tmp_path)) 137 + _ = callosum_server # Needed for side effects only 134 138 135 139 agent_ids = [] 136 140 for i in range(3): ··· 146 150 147 151 def test_cortex_request_no_journal_path(callosum_server): 148 152 """Test cortex_request fails without JOURNAL_PATH.""" 153 + _ = callosum_server # Needed for side effects only 149 154 old_path = os.environ.pop("JOURNAL_PATH", None) 150 155 try: 151 156 with pytest.raises(

+1 -1

tests/test_supervisor.py

··· 170 170 171 171 procs = mod.start_observers() 172 172 assert len(procs) == 2 173 - assert any(cmd == ["observe-gnome", "-v"] for cmd, _, _ in started) 173 + assert any(cmd == ["observer", "-v"] for cmd, _, _ in started) 174 174 assert any(cmd == ["observe-sense", "-v"] for cmd, _, _ in started) 175 175 # Check that stdout and stderr capture pipes 176 176 for cmd, stdout, stderr in started:

+4 -4

think/runner.py

··· 209 209 RuntimeError: If process fails to spawn 210 210 211 211 Example: 212 - managed = ManagedProcess.spawn(["observe-gnome", "-v"]) 213 - # Logs to: {JOURNAL}/{YYYYMMDD}/health/{ref}_observe-gnome.log 214 - # Symlinks: {YYYYMMDD}/health/observe-gnome.log (day-level) 215 - # health/observe-gnome.log (journal-level) 212 + managed = ManagedProcess.spawn(["observer", "-v"]) 213 + # Logs to: {JOURNAL}/{YYYYMMDD}/health/{ref}_observer.log 214 + # Symlinks: {YYYYMMDD}/health/observer.log (day-level) 215 + # health/observer.log (journal-level) 216 216 217 217 # With explicit correlation ID: 218 218 managed = ManagedProcess.spawn(

+3 -3

think/supervisor.py

··· 749 749 750 750 751 751 def start_observers() -> list[ManagedProcess]: 752 - """Launch observe-gnome and observe-sense with output logging.""" 752 + """Launch observer (platform-detected) and observe-sense with output logging.""" 753 753 procs: list[ManagedProcess] = [] 754 754 commands = { 755 - "observer": ["observe-gnome", "-v"], 755 + "observer": ["observer", "-v"], 756 756 "sense": ["observe-sense", "-v"], 757 757 } 758 758 for name, cmd in commands.items(): ··· 1140 1140 parser.add_argument( 1141 1141 "--no-observers", 1142 1142 action="store_true", 1143 - help="Do not automatically start observe-gnome and observe-sense", 1143 + help="Do not automatically start observer and observe-sense", 1144 1144 ) 1145 1145 parser.add_argument( 1146 1146 "--no-daily",

Configure Feed

Configure Feed