personal memory agent

Fix macOS observer stdout parsing and add real-time sck-cli logging

- Fix select() loop to continue polling instead of breaking on no data
- Add background threads to stream sck-cli stdout/stderr in real-time
- Make status event structure compatible with Linux observer
- Fix PyAV container leak with try/finally pattern
- Fix type hint for get_timestamp_parts (float | None)
- Update docstrings and TODO.md to reflect current state
- Remove unused output_base instance variable

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
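
The background-thread streaming named in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the observer's code: `stream_lines`, `run_and_stream`, and the `sink` callback are hypothetical names, and any command can stand in for sck-cli.

```python
import subprocess
import sys
import threading
from typing import IO, Callable


def stream_lines(pipe: IO[str], prefix: str, sink: Callable[[str], None]) -> None:
    """Forward each non-empty line from pipe to sink until the pipe closes."""
    for line in pipe:
        line = line.strip()
        if line:
            sink(f"{prefix}: {line}")


def run_and_stream(cmd: list[str], sink: Callable[[str], None]) -> int:
    """Run cmd and relay its stdout/stderr line by line via daemon threads."""
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered text pipes on the parent side
    ) as proc:
        threads = [
            threading.Thread(
                target=stream_lines, args=(proc.stdout, "stdout", sink), daemon=True
            ),
            threading.Thread(
                target=stream_lines, args=(proc.stderr, "stderr", sink), daemon=True
            ),
        ]
        for thread in threads:
            thread.start()
        returncode = proc.wait()
        # The readers exit on EOF once the child has closed its pipes.
        for thread in threads:
            thread.join(timeout=1)
    return returncode
```

One reader per pipe keeps a chatty stderr from filling its pipe buffer and stalling the child while the parent is busy elsewhere. For example, `run_and_stream([sys.executable, "-c", "print('hi')"], print)` relays the child's single line through `print`.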

+143 -200
+21 -170
observe/macos/TODO.md
···
-# macOS Observer Implementation TODO
-
-This document tracks the remaining work to complete the macOS observer integration using sck-cli and ScreenCaptureKit.
-
-## Phase 1: Activity Detection (activity.py) - DONE
-
-### 1.1 Implement `get_idle_time_ms()` (DONE)
-- [x] Import PyObjC Quartz framework
-- [x] Use `CGEventSourceSecondsSinceLastEventType(1, kCGAnyInputEventType)`
-- [x] Convert seconds to milliseconds
-- [x] Add error handling for API failures
-- [x] Test on macOS system
-
-### 1.2 Implement `is_screen_locked()` (DONE)
-- [x] Used CGSessionCopyCurrentDictionary for kCGSSessionOnConsoleKey
-- [x] Add error handling
-- [x] Test on macOS system
-
-### 1.3 Implement `is_power_save_active()` (DONE)
-- [x] Used CGDisplayIsAsleep(CGMainDisplayID())
-- [x] Add error handling
-- [x] Test on macOS system
-
-### 1.4 Implement `is_output_muted()` (DONE)
-- [x] Used osascript to query volume settings
-- [x] Add error handling and timeout
-- [x] Test on macOS system
-
-## Phase 2: ScreenCaptureKit Manager (screencapture.py)
-
-**Note:** sck-cli now provides multi-display capture with JSONL metadata output to stdout.
-Display geometry is parsed from sck-cli output - no PyObjC monitor detection needed.
-
-### 2.1 JSONL Parsing (DONE)
-- [x] Parse sck-cli stdout for display geometry
-- [x] Extract displayID, x, y, width, height per display
-- [x] Use `assign_monitor_positions()` to compute position labels
-- [x] Build DisplayInfo objects with position, displayID, temp_path
-
-### 2.2 Implement `start()` (DONE)
-- [x] Build command with frame rate and duration
-- [x] Launch subprocess and capture stdout
-- [x] Parse JSONL for display and audio info
-- [x] Return list of DisplayInfo and AudioInfo
-
-### 2.3 Implement `stop()` (DONE)
-- [x] Send SIGTERM to process
-- [x] Wait with timeout for graceful shutdown
-- [x] SIGKILL as fallback
-
-### 2.4 Implement `finalize()` (DONE)
-- [x] Simple file rename (no metadata embedding needed)
-- [x] Rename per-display: `temp_displayID.mov` -> `HHMMSS_LEN_position_displayID_screen.mov`
-- [x] Rename audio: `temp.m4a` -> `HHMMSS_LEN_audio.m4a`
-
-### 2.5 Implement `get_output_size()` (DONE)
-- [x] Sum sizes of all display video files
-- [x] Used for health check file growth verification
-
-## Phase 3: Main Observer (observer.py) - DONE
-
-### 3.1 Implement `setup()` (DONE)
-- [x] Verify sck-cli is available in PATH via shutil.which()
-- [x] Initialize Callosum connection
-- [x] Start Callosum connection
-- [x] Log initialization success
-- [x] Return True on success, False on failure
+# macOS Observer TODO

-### 3.2 Implement `check_activity_status()` (DONE)
-- [x] Call `get_idle_time_ms()` from activity module
-- [x] Call `is_screen_locked()` from activity module
-- [x] Call `is_output_muted()` from activity module
-- [x] Cache values in instance variables for status events
-- [x] Determine if idle: `(idle_time > IDLE_THRESHOLD_MS) or screen_locked`
-- [x] Return activity status
+Tracks remaining work for the macOS observer integration.

-### 3.3 Implement `handle_boundary()` (DONE)
-- [x] Get timestamp parts and calculate duration
-- [x] Stop capture if running
-- [x] Check audio threshold (3-chunk RMS logic) before saving audio
-- [x] Build finalization list and queue
-- [x] Reset timing for new window
-- [x] Start new capture if active and screen not locked
-- [x] Emit Callosum observing event with saved files
+## Completed

-### 3.4 Implement `initialize_capture()` (DONE)
-- [x] Get timestamp for filename
-- [x] Build temp output base (hidden file)
-- [x] Start sck-cli via ScreenCaptureKitManager
-- [x] Store displays and audio info
-- [x] Initialize file size tracking
-- [x] Log capture start with display info
-
-### 3.5 Implement `emit_status()` (DONE)
-- [x] Build capture info dict with recording status, displays, elapsed time, files_growing
-- [x] Build activity info dict with active, idle_time_ms, screen_locked, output_muted
-- [x] Emit via Callosum
-
-### 3.6 Implement `finalize_screencast()` (DONE)
-- [x] Simple file rename using os.replace()
-- [x] Log success/failure
-
-### 3.7 Implement `main_loop()` (DONE)
-- [x] Check initial activity status
-- [x] Start initial capture if active
-- [x] Main loop with CHUNK_DURATION sleep intervals
-- [x] Process pending finalizations
-- [x] Check activity status and detect activation edge
-- [x] Detect mute state transitions (triggers boundary like GNOME)
-- [x] Handle window boundaries
-- [x] Track file growth for health reporting
-- [x] Emit status events
-
-### 3.8 Implement `shutdown()` (DONE)
-- [x] Stop capture if running
-- [x] Check audio threshold for final segment
-- [x] Finalize all pending captures
-- [x] Stop Callosum connection
-
-### 3.9 Implement `_check_audio_threshold()` (DONE)
-- [x] Decode m4a with PyAV
-- [x] Split into 5-second chunks
-- [x] Compute RMS per chunk
-- [x] Count threshold hits (same MIN_HITS_FOR_SAVE = 3 as GNOME)
-- [x] Return True if enough voice activity
-
-### 3.10 Wire up CLI arguments (DONE)
-- [x] Pass --sck-cli-path to ScreenCaptureKitManager
+- **Phase 1: Activity Detection** (`activity.py`) - All done
+- **Phase 2: ScreenCaptureKit Manager** (`screencapture.py`) - All done
+- **Phase 3: Main Observer** (`observer.py`) - All done
+- **Phase 5: sck-cli** - All requirements met

 ## Phase 4: Testing & Integration

···
 - [ ] Test parse_screen_filename() with new displayID format
 - [ ] Verify think-indexer handles new file formats

-## Phase 5: sck-cli (DONE)
-
-All sck-cli requirements are met:
-- [x] Multi-display capture with per-display files
-- [x] JSONL metadata output to stdout
-- [x] Temp file support (Python passes hidden path like `.HHMMSS`)
-- [x] Graceful SIGTERM/SIGINT handling (verified)
-- [x] File validation done in Python's `finalize()`
-
 ## Phase 6: Documentation & Polish

 ### 6.1 Documentation
-- [ ] Add docstring examples to all public functions
-- [ ] Create observe/macos/README.md with:
-  - Installation instructions (PyObjC, sck-cli)
-  - Usage examples
-  - Configuration options
-  - Troubleshooting guide
+- [ ] Create observe/macos/README.md with installation and usage
 - [ ] Update main README.md to mention macOS support
 - [ ] Document differences from Linux observer

 ### 6.2 Code Quality
-- [ ] Run `make format` to format all new code
-- [ ] Run `make lint` and fix any issues
-- [ ] Add type hints to all function signatures
-- [ ] Add logging at appropriate levels (INFO, DEBUG, WARNING, ERROR)
+- [x] Run `make format` and `make lint`
+- [x] Add type hints to function signatures
+- [x] Proper logging at appropriate levels

-### 6.3 Error Handling
-- [ ] Review all TODO implementations for error handling
-- [ ] Add try/except blocks where needed
-- [ ] Ensure errors are logged with context
-- [ ] Ensure errors don't crash the observer (graceful degradation)
+---

-## Notes
-
-### Architecture Changes from Original Plan
-- **No PyObjC monitor detection needed**: sck-cli provides display geometry via JSONL stdout
-- **No metadata embedding**: Position/displayID encoded in filename instead
-- **Multi-display from day one**: sck-cli captures all displays automatically
-- **DisplayInfo dataclass**: Mirrors GNOME's StreamInfo pattern
+## Reference

 ### File Naming Convention
 - **Video**: `HHMMSS_LEN_position_displayID_screen.mov` (e.g., `120000_300_center_1_screen.mov`)
 - **Audio**: `HHMMSS_LEN_audio.m4a` (e.g., `120000_300_audio.m4a`)
 - **Temp files**: `.HHMMSS_displayID.mov`, `.HHMMSS.m4a` (hidden during capture)

-### Differences from GNOME Observer
-- **Audio**: sck-cli provides synchronized .m4a instead of separate AudioRecorder
-- **Format**: .mov video instead of .webm
-- **Activity APIs**: PyObjC instead of DBus
-- **Subprocess**: Manages external sck-cli process instead of direct API calls
-- **Connector ID**: Uses numeric displayID instead of connector names like "DP-3"
-- **No RMS threshold**: Audio always captured when recording
+### Differences from Linux Observer
+- **Audio threshold**: macOS checks at boundary (post-capture), Linux checks real-time
+- **Format**: .mov video instead of .webm, .m4a audio instead of .flac
+- **Activity APIs**: PyObjC/Quartz instead of DBus
+- **Capture**: External sck-cli process instead of GStreamer/PipeWire
+- **Connector ID**: Numeric displayID instead of connector names like "DP-3"
+- **No tmux mode**: macOS observer only has screencast/idle modes

 ### Dependencies
 - sck-cli must be built and available in PATH (or specified via --sck-cli-path)
-- PyObjC frameworks required: core, Cocoa, Quartz (for activity detection only)
+- PyObjC frameworks required: core, Cocoa, Quartz (for activity detection)
 - observe.utils.assign_monitor_positions for position label computation
-
-### Testing Strategy
-1. Start with activity.py (testable independently)
-2. Then screencapture.py (can test with mock sck-cli or real capture)
-3. Then observer.py (integration testing)
-4. Finally sck-cli enhancements (separate repo)
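
The file naming convention listed in the TODO's reference section could be expressed as small helpers like these. This is only a sketch: the function names are hypothetical and do not correspond to anything confirmed in the repository; only the name patterns come from the TODO.

```python
def final_video_name(time_part: str, length_s: int, position: str, display_id: int) -> str:
    """Build HHMMSS_LEN_position_displayID_screen.mov."""
    return f"{time_part}_{length_s}_{position}_{display_id}_screen.mov"


def final_audio_name(time_part: str, length_s: int) -> str:
    """Build HHMMSS_LEN_audio.m4a."""
    return f"{time_part}_{length_s}_audio.m4a"


def temp_video_name(time_part: str, display_id: int) -> str:
    """Build the hidden per-display temp name used during capture."""
    return f".{time_part}_{display_id}.mov"


def temp_audio_name(time_part: str) -> str:
    """Build the hidden temp audio name used during capture."""
    return f".{time_part}.m4a"
```

With the TODO's own example values, `final_video_name("120000", 300, "center", 1)` yields `120000_300_center_1_screen.mov`.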
+1 -1
observe/macos/__init__.py
···
 # SPDX-License-Identifier: AGPL-3.0-only
 # Copyright (c) 2026 sol pbc

-"""macOS-specific observation utilities using ScreenCaptureKit."""
+"""macOS-specific observation: activity detection (PyObjC/Quartz) and capture (sck-cli)."""
+2 -2
observe/macos/activity.py
···

 """macOS system activity detection using PyObjC.

-This module mirrors the GNOME dbus.py structure, providing activity detection
-primitives using native macOS APIs via PyObjC.
+This module mirrors the observe/gnome/activity.py structure, providing activity
+detection primitives using native macOS APIs via PyObjC.
 """

 import logging
+37 -13
observe/macos/observer.py
···

         return is_active

-    def get_timestamp_parts(self, timestamp: float = None) -> tuple[str, str]:
+    def get_timestamp_parts(self, timestamp: float | None = None) -> tuple[str, str]:
         """
         Get date and time parts from timestamp.

···
             logger.warning(f"Audio file not found for threshold check: {audio_path}")
             return False

+        container = None
         try:
             container = av.open(audio_path)
             audio_streams = list(container.streams.audio)

             if not audio_streams:
-                container.close()
                 logger.warning(f"No audio streams in {audio_path}")
                 return False

···
                     arr = arr.mean(axis=0)
                 samples.append(arr.flatten())

-            container.close()
-
             if not samples:
                 logger.warning(f"No audio samples decoded from {audio_path}")
                 return False
···
             logger.warning(f"Error checking audio threshold for {audio_path}: {e}")
             # On error, keep the file (safer default)
             return True
+        finally:
+            if container is not None:
+                container.close()

     def handle_boundary(self, is_active: bool):
         """
···
         return True

     def emit_status(self):
-        """Emit observe.status event with current state."""
+        """Emit observe.status event with current state.
+
+        Event structure matches Linux observer for compatibility:
+        - mode: "screencast" or "idle" (macOS doesn't have tmux mode)
+        - screencast: recording status and display info
+        - tmux: always empty (not supported on macOS)
+        - audio: always empty (macOS checks threshold at boundary, not real-time)
+        - activity: system activity status
+        """
         if not self.callosum:
             return

         journal_path = os.getenv("JOURNAL_PATH", "")

-        # Build capture info
+        # Determine mode (macOS is binary: screencast or idle)
+        mode = "screencast" if self.capture_running else "idle"
+
+        # Build screencast info (matches Linux observer structure)
         if self.capture_running and self.current_displays:
             elapsed = int(time.monotonic() - self.start_at_mono)
-            displays_info = []
+            streams_info = []
             for display in self.current_displays:
                 try:
                     rel_file = (
···
                 except ValueError:
                     rel_file = display.temp_path

-                displays_info.append(
+                streams_info.append(
                     {
                         "position": display.position,
-                        "display_id": display.display_id,
+                        "connector": str(display.display_id),
                         "file": rel_file,
                     }
                 )

-            capture_info = {
+            screencast_info = {
                 "recording": True,
-                "displays": displays_info,
+                "streams": streams_info,
                 "window_elapsed_seconds": elapsed,
                 "files_growing": self.files_growing,
             }
         else:
-            capture_info = {"recording": False, "files_growing": False}
+            screencast_info = {"recording": False, "files_growing": False}
+
+        # Tmux info (not supported on macOS)
+        tmux_info = {"capturing": False}
+
+        # Audio info (macOS checks threshold at boundary, not real-time)
+        audio_info = {
+            "threshold_hits": 0,
+            "will_save": False,
+        }

         # Activity info
         activity_info = {
···
         self.callosum.emit(
             "observe",
             "status",
-            capture=capture_info,
+            mode=mode,
+            screencast=screencast_info,
+            tmux=tmux_info,
+            audio=audio_info,
             activity=activity_info,
         )

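
The reshaped status event can be summarized by a standalone builder. This is a simplified sketch: the field names are taken from the diff above, while `Display` and `build_status` are hypothetical stand-ins, and the relative-path computation against `JOURNAL_PATH` is omitted.

```python
from dataclasses import dataclass


@dataclass
class Display:
    """Hypothetical stand-in for the observer's DisplayInfo."""

    position: str
    display_id: int
    temp_path: str


def build_status(
    displays: list[Display],
    capture_running: bool,
    elapsed: int,
    files_growing: bool,
    activity: dict,
) -> dict:
    """Assemble a status payload with the Linux-compatible top-level keys."""
    mode = "screencast" if capture_running else "idle"

    if capture_running and displays:
        screencast = {
            "recording": True,
            "streams": [
                {
                    "position": d.position,
                    # Numeric displayID is sent as a string under "connector".
                    "connector": str(d.display_id),
                    "file": d.temp_path,
                }
                for d in displays
            ],
            "window_elapsed_seconds": elapsed,
            "files_growing": files_growing,
        }
    else:
        screencast = {"recording": False, "files_growing": False}

    return {
        "mode": mode,
        "screencast": screencast,
        "tmux": {"capturing": False},  # no tmux mode on macOS
        "audio": {"threshold_hits": 0, "will_save": False},  # checked at boundary
        "activity": activity,
    }
```

Keeping the `tmux` and `audio` keys present with fixed placeholder values means a consumer written against the Linux event shape should not need a macOS-specific branch.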
+82 -14
observe/macos/screencapture.py
···

 import json
 import logging
+import select
 import signal
 import subprocess
+import threading
+import time
 from dataclasses import dataclass
 from pathlib import Path
 from typing import Optional

 from observe.utils import assign_monitor_positions
+
+# Timeout for reading metadata from sck-cli (seconds)
+METADATA_TIMEOUT = 5.0

 logger = logging.getLogger(__name__)

···
         self.process: Optional[subprocess.Popen] = None
         self.displays: list[DisplayInfo] = []
         self.audio: Optional[AudioInfo] = None
-        self.output_base: Optional[Path] = None
+        self._output_threads: list[threading.Thread] = []

     def start(
         self,
···
             >>> output_base = day_dir / ".120000" # Hidden temp file
             >>> displays, audio = manager.start(output_base, duration=300)
         """
-        self.output_base = output_base
-
         # Build command
         cmd = [
             self.sck_cli_path,
···
                 stdout=subprocess.PIPE,
                 stderr=subprocess.PIPE,
                 text=True,
+                bufsize=1, # Line buffering for real-time output
             )
         except FileNotFoundError:
             raise RuntimeError(f"sck-cli not found at: {self.sck_cli_path}")
···
         displays_raw = []
         audio_info = None

-        # Read lines until we get all metadata (sck-cli outputs then starts capture)
-        # We need to read non-blocking since the process keeps running
+        # Read lines until we get both display and audio metadata.
+        # Use select() with timeout to avoid blocking forever - the process
+        # keeps running for the capture duration but outputs metadata upfront.
+        # Note: "for line in file:" uses block buffering which can hang.
+        deadline = time.monotonic() + METADATA_TIMEOUT
+        stdout_fd = self.process.stdout.fileno()
+
         try:
-            for line in self.process.stdout:
+            while time.monotonic() < deadline:
+                # Wait for data with remaining timeout
+                remaining = deadline - time.monotonic()
+                if remaining <= 0:
+                    break
+
+                readable, _, _ = select.select([stdout_fd], [], [], min(remaining, 1.0))
+                if not readable:
+                    # No data yet, keep polling until deadline
+                    continue
+
+                line = self.process.stdout.readline()
+                if not line:
+                    # EOF - process closed stdout
+                    break
+
                 line = line.strip()
                 if not line:
                     continue
+
                 try:
                     data = json.loads(line)
                     if data.get("type") == "display":
···
                     elif data.get("type") == "audio":
                         audio_info = data
                 except json.JSONDecodeError:
-                    # Not JSON, might be a log message - ignore
+                    # Not JSON, just a log message (already logged above)
                     pass

-                # sck-cli outputs all metadata before starting capture
-                # Once we have both displays and audio (or displays only if no audio)
-                # we can stop reading. But we also need to not block forever.
-                # Actually, sck-cli flushes stdout after metadata, so readline
-                # will return empty when no more data. But process is still running.
-                # We break after getting audio info or when stdout blocks.
-                if audio_info is not None:
+                # Break once we have both display and audio info
+                if displays_raw and audio_info is not None:
                     break
         except Exception as e:
             logger.warning(f"Error reading sck-cli stdout: {e}")
···
         if self.audio:
             logger.info(f" Audio: {self.audio.temp_path} ({self.audio.tracks})")

+        # Start background threads to log remaining stdout/stderr in real-time
+        self._output_threads = [
+            threading.Thread(
+                target=self._stream_stdout,
+                daemon=True,
+                name="sck-cli-stdout",
+            ),
+            threading.Thread(
+                target=self._stream_stderr,
+                daemon=True,
+                name="sck-cli-stderr",
+            ),
+        ]
+        for thread in self._output_threads:
+            thread.start()
+
         return self.displays, self.audio

     def stop(self) -> None:
···
         except Exception as e:
             logger.warning(f"Error stopping sck-cli: {e}")

+        # Wait for output threads to finish (they exit when pipes close)
+        for thread in self._output_threads:
+            thread.join(timeout=1)
+        self._output_threads = []
+
         self.process = None
+
+    def _stream_stdout(self) -> None:
+        """Background thread: stream remaining stdout lines to logger."""
+        if self.process is None or self.process.stdout is None:
+            return
+
+        try:
+            for line in self.process.stdout:
+                line = line.strip()
+                if line:
+                    logger.info(f"sck-cli: {line}")
+        except Exception as e:
+            logger.debug(f"Error reading sck-cli stdout: {e}")
+
+    def _stream_stderr(self) -> None:
+        """Background thread: stream stderr lines to logger."""
+        if self.process is None or self.process.stderr is None:
+            return
+
+        try:
+            for line in self.process.stderr:
+                line = line.strip()
+                if line:
+                    logger.info(f"sck-cli stderr: {line}")
+        except Exception as e:
+            logger.debug(f"Error reading sck-cli stderr: {e}")

     def is_running(self) -> bool:
         """
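
The deadline-bounded polling used for the metadata read can be tried in isolation with a sketch like the one below. It is a variation rather than a copy of the commit's code: `read_metadata` is a hypothetical helper, the pipe is assumed to be opened in binary mode (no `text=True`), and bytes are read with `os.read()` instead of `readline()`. That last choice is deliberate, because a buffered `readline()` can pull several lines into Python's own buffer, after which `select()` on the file descriptor may report nothing to read even though lines are still pending.

```python
import json
import os
import select
import subprocess
import time


def read_metadata(proc: subprocess.Popen, timeout: float = 5.0):
    """Collect 'display' and 'audio' JSONL records from proc.stdout.

    Stops when at least one display and the audio record are seen, at EOF,
    or when the deadline passes. Non-JSON lines are skipped.
    """
    fd = proc.stdout.fileno()
    deadline = time.monotonic() + timeout
    pending = b""  # bytes of a not-yet-complete line
    displays, audio = [], None

    while time.monotonic() < deadline:
        remaining = deadline - time.monotonic()
        wait = max(0.0, min(remaining, 1.0))
        readable, _, _ = select.select([fd], [], [], wait)
        if not readable:
            continue  # no data yet: keep polling until the deadline

        chunk = os.read(fd, 4096)
        if not chunk:
            break  # EOF: the child closed stdout

        pending += chunk
        *lines, pending = pending.split(b"\n")
        for raw in lines:
            text = raw.decode("utf-8", errors="replace").strip()
            if not text:
                continue
            try:
                record = json.loads(text)
            except json.JSONDecodeError:
                continue  # plain log line, not metadata
            if not isinstance(record, dict):
                continue
            if record.get("type") == "display":
                displays.append(record)
            elif record.get("type") == "audio":
                audio = record

        if displays and audio is not None:
            break

    return displays, audio
```

Capping each `select()` wait at one second keeps the loop responsive to the overall deadline, and continuing on an empty result (instead of breaking) is the behavior the commit message describes as the fix.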