personal memory agent
docs: inventory all destructive removal sites

Catalog every non-test, non-atomic-tmp destructive removal in
production code, classified against think/retention.py as the
reference model.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

+160
docs/deletion-sites-inventory.md
# Removal Sites Inventory

Inventory of every non-test, non-scratch, non-atomic-tmp destructive removal (`shutil.rmtree`, `Path.unlink`, `os.remove`, `os.unlink`) in production code.

> Atomic-tmp exclusion heuristic:
> same-directory temp sibling created inside the same function for atomic replacement of one target file, promoted via `os.replace`/`rename`, with `unlink` only in the exception cleanup branch. Do not exclude directory deletes, named domain paths, or rollback deletes of non-temp targets.

## Methodology

- Scope: every non-test, non-scratch, non-atomic-tmp destructive removal (`shutil.rmtree`, `Path.unlink`, `os.remove`, `os.unlink`) in production code.
- Grep command: `rg -n 'shutil\.rmtree|\.unlink\(|os\.remove|os\.unlink' --type py`
- Exclusion filter: `tests/`, `scratch/`, `.venv/`, `tmp/`, `observers/`
- Atomic-tmp exclusion heuristic:

  > same-directory temp sibling created inside the same function for atomic replacement of one target file, promoted via `os.replace`/`rename`, with `unlink` only in the exception cleanup branch. Do not exclude directory deletes, named domain paths, or rollback deletes of non-temp targets.

- Reference model: `think/retention.py`
  - scope-narrow docstring at `:4-19`
  - completion check at `:73-115`
  - per-file stream-hashed SHA-256 at `:416-422`
  - dry-run support at `:349-369`, `:427-429`, `:450-451`
  - narrow exception handling at `:378-381`, `:416-429`
  - retention log at `:456-472`
- Write-owner table pointer: `CLAUDE.md` / `AGENTS.md` §7 L2
- Importer convention: importers audit destructive operations via `log_app_action(app='import', ...)` per repo convention (`think/importers/journal_source_cli.py:40, 75, 230, 250`).
- Raw grep noise removed manually: nested app test hit at `apps/observer/tests/test_routes.py:1008` and regex literals in `scripts/check_layer_hygiene.py:54-55`.
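
The atomic-tmp pattern that the exclusion heuristic describes can be illustrated with a minimal sketch. This is not code from the repository: `atomic_write_text` is a hypothetical helper, and it assumes Python 3.8+ for `Path.unlink(missing_ok=True)`. It shows why such an `unlink` is excluded from the inventory: the only thing it can ever remove is the temp sibling created a few lines earlier in the same function.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(target: Path, text: str) -> None:
    """Replace `target` atomically via a temp sibling in the same directory.

    Hypothetical helper for illustration; not taken from the codebase.
    """
    # Create the temp file next to the target so os.replace never has to
    # cross a filesystem boundary.
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".tmp"
    )
    tmp_path = Path(tmp_name)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        # Promote the temp sibling over the target in one step.
        os.replace(tmp_path, target)
    except BaseException:
        # The only unlink in the function: cleanup of the temp sibling
        # when the write or the promotion failed.
        tmp_path.unlink(missing_ok=True)
        raise
```

A removal site that deletes a directory, a named domain path, or a non-temp target during rollback does not fit this shape and therefore stays in the inventory.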

## Classification Legend

- `✅` matches the retention reference model closely enough to serve as the template.
- `⚠️` has partial safety coverage or is intentionally out of scope for this sweep.
- `❌` remains a destructive gap after applying the exclusion heuristic.

## think/retention (reference)

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `think/retention.py:428` | raw media files in completed segments | retention purge on eligible segments | `is_segment_complete()` plus retention-policy eligibility | `_write_retention_log()` to `health/retention.log` | yes | `✅` | reference template for this sweep |

## think/entities

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `think/entities/journal.py:369,375` | `facets/*/entities/<id>/` rel dirs and `entities/<id>/` | `delete_journal_entity()` | entity must exist, must not be principal, and each target must exist as a directory | yes (route: `apps/entities/routes.py:910-918`) | no | `⚠️` | helper itself is unaudited, but the production route is audited; deferred follow-up |
| `think/entities/merge.py:520,536,697,702` | target rel dir overwrite, source rel dir, discovery cache, source entity dir | `merge_entity(..., commit=True)` | source/target entities are loaded and validated up front; delete paths come from the merge plan plus `exists()` checks | yes (`think/entities/merge.py:587-611,706-714`) | yes (`commit=False`) | `⚠️` | audited commit flow, but the broader merge bundle is intentionally deferred |

## think/importers

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `think/importers/shared.py:361` | existing `imports/<timestamp>/` directory | `_setup_import(..., force=True)` | fixed `journal/imports/<timestamp>` path and `import_dir.exists()` gate | yes (`think/importers/shared.py:351-357`) | yes | `✅` | fixed in this sweep: per-file manifest is hashed and logged before `rmtree` |
| `think/importers/plaud.py:196` | temporary download file | Plaud download write failure | exact `NamedTemporaryFile` path created in the same function | no | no | `⚠️` | temp download cleanup, not a journal-domain delete |

## think/facets

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `think/facets.py:907` | `facets/<name>/` directory | `delete_facet()` | facet path resolves under `journal/facets`, with existing-facet checks before delete | yes (`think/facets.py:899-906`) | no | `⚠️` | audited write-owner delete path; deferred rather than expanded in this lode |

## think/indexer

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `think/indexer/journal.py:876` | SQLite index database file | `reset_journal_index()` | fixed `journal/indexer/<db>` path | no | no | `⚠️` | index artifact reset; infrastructure is out of scope for retention-style parity |

## think/identity

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `think/identity.py:350` | newly created identity file | rollback on history-append failure in `_write_identity_locked()` | target is scoped to the locked identity dir and only removed on exception after create | no | no | `⚠️` | rollback delete of a just-created file, not a steady-state delete path |

## think/tools

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `think/tools/call.py:402` | source facet entity dir | facet merge when destination already has the entity | source/dest facets are validated and source dir must exist | yes (`think/tools/call.py:441-450`) | no | `⚠️` | audited merge flow, but the larger facet-merge bundle is deferred |

## apps/entities

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `apps/entities/call.py:179` | source facet entity dir | `sol call entities move --merge` when destination already has the entity | source facet, destination facet, entity resolution, and source dir existence are all checked first | yes (`apps/entities/call.py:184-193`) | no | `⚠️` | audited write-owner CLI path; deferred rather than widened in this sweep |

## apps/speakers

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `apps/speakers/routes.py:260` | `entities/<id>/voiceprints.npz` | `api_correct_attribution()` when the removed entry was the NPZ's last entry | entity memory path must resolve, NPZ must exist, and metadata tuple must match before unlink | yes (`apps/speakers/routes.py:981-994`) | no | `✅` | fixed in this sweep: audit payload now records `voiceprints_removed` only on actual unlink |
| `apps/speakers/discovery.py:87` | `awareness/discovery_clusters.json` | discovery starts without an owner centroid | fixed awareness cache path | no | no | `⚠️` | awareness cache invalidation, not journal-domain deletion |
| `apps/speakers/discovery.py:494` | `awareness/discovery_clusters.json` | `identify_unknown_speaker()` completes | fixed awareness cache path | no | no | `⚠️` | cache cleanup after identification; out of scope for this sweep |
| `apps/speakers/owner.py:419,446` | owner-candidate NPZ | owner candidate confirm/reject flows | fixed candidate path under awareness state | no (state update only at `apps/speakers/owner.py:421-428,447-452`) | no | `⚠️` | awareness candidate lifecycle cleanup, not a journal-domain delete |

## apps/transcripts

Out of scope for this sweep; keep visible because it is a destructive journal-domain route owned by a separate transcript bundle.

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `apps/transcripts/routes.py:521` | segment directory under `chronicle/<day>/<stream>/` | `DELETE /api/segment/...` | day regex, segment-key validation, existence check, and `commonpath` containment check | yes (`apps/transcripts/routes.py:524-529`) | no | `⚠️` | destructive transcript route owned by a separate bundle; tracked but out of scope here |

## apps/import

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `apps/import/routes.py:224,250` | request temp files used for timestamp detection and staged upload copy | import upload request handling | both paths come from `NamedTemporaryFile` in the same request | no | no | `⚠️` | request-scoped temp cleanup, not persisted journal deletion |
| `apps/import/call.py:278,279` | staged config diff files | final config-review resolution | fixed paths under the resolved import-review `state_dir` | yes (`apps/import/call.py:281-290`) | no | `⚠️` | review-state cleanup after explicit operator resolution |
| `apps/import/call.py:401,437,452` | staged entity review file | merge/create/skip entity review resolution | `staged_path` must exist under `state_dir/entities/staged` | yes (`apps/import/call.py:402-463`) | no | `⚠️` | review-state cleanup after explicit operator resolution |
| `apps/import/call.py:507,583,605` | staged facet review file | skip/apply facet review resolution | `staged_path` must exist under `state_dir/facets/staged` | yes (`apps/import/call.py:508-615`) | no | `⚠️` | review-state cleanup after explicit operator resolution |

## apps/settings

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `apps/settings/routes.py:878` | canonical `journal/.config/vertex-credentials.json` | provider update clears Vertex credentials | stored path must resolve to the canonical credential path before unlink | yes (`apps/settings/routes.py:892-899`) | no | `⚠️` | config artifact cleanup with a canonical-path guard |
| `apps/settings/call.py:511` | canonical `journal/.config/vertex-credentials.json` | `sol call settings vertex clear` | stored path must resolve to the canonical credential path before unlink | no | no | `⚠️` | CLI config cleanup outside the journal-domain sweep |

## apps/support

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `apps/support/routes.py:171` | uploaded attachment temp file | support attachment upload completes or fails | exact temp path created for the request | no | no | `⚠️` | request temp cleanup, not journal-domain deletion |

## observe

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `observe/observer_client.py:39` | files inside a draft capture directory | `cleanup_draft()` | iterates only files already inside the draft directory | no | no | `⚠️` | draft temp cleanup on the observe side |
| `observe/sense.py:982` | derived output files for a segment | `delete_outputs()` during reprocess cleanup | delete only when the file matches the requested reprocess type and a corresponding source exists | no (logger only) | yes | `⚠️` | observe-side cleanup has dry-run support but not retention-style logging |
| `observe/transcribe/main.py:546,682` | raw/audio capture files that fail VAD thresholds | transcription filtering | delete is gated by VAD outcome on the source file | no (callosum event only) | no | `⚠️` | observe-side source filtering, not part of this journal-domain sweep |
| `observe/transcribe/revai.py:396` | temporary audio upload file | Rev.ai transcription request teardown | exact temp path plus `exists()` check | no | no | `⚠️` | request temp cleanup |
| `observe/transcribe/whisper.py:231` | temporary audio upload file | Whisper transcription request teardown | exact temp path plus `exists()` check | no | no | `⚠️` | request temp cleanup |

## IPC/health

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `think/callosum.py:59,91` | `health/callosum.sock` | callosum server start/stop | fixed socket path and `exists()` checks | no | no | `⚠️` | IPC socket cleanup is out of scope for journal-domain parity |
| `think/supervisor.py:870` | `health/callosum.sock` | supervisor pre-start stale-socket cleanup | fixed `server.socket_path` plus `exists()` check | no | no | `⚠️` | IPC socket race prevention, out of scope |
| `think/heartbeat.py:91,101,138` | heartbeat PID file | stale/corrupt PID cleanup and final teardown | fixed PID path with stale/corrupt guards | no (logger only) | no | `⚠️` | service lifecycle cleanup, not journal-domain deletion |
| `think/service.py:199,215` | installed service plist/unit file | service uninstall | fixed platform-specific install path and `exists()` check | no | no | `⚠️` | installed-service artifact cleanup, out of scope |
| `think/install_guard.py:147,168` | owned `sol` alias symlink | install/uninstall guard | alias ownership is checked before unlink | no | no | `⚠️` | user-bin alias management, not journal-domain deletion |

## maint

| file:line | target | trigger | path validation | audit log | dry-run | class | why |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `apps/observer/maint/000_migrate_remote_to_observer.py:38` | legacy observer source file | one-shot remote-to-observer migration | migration resolves the legacy source path before delete | no | no | `⚠️` | shipped maint migration; one-shot historical cleanup |
| `apps/settings/maint/002_restructure_stream_dirs.py:122` | legacy segment directory | one-shot stream-dir restructuring migration | delete happens only after migration work on that segment dir | no | no | `⚠️` | shipped maint migration; one-shot historical cleanup |
| `apps/sol/maint/000_migrate_agent_layout.py:46` | legacy agent layout file | one-shot agent-layout migration | migration resolves the legacy source before unlink | no | no | `⚠️` | shipped maint migration; one-shot historical cleanup |
| `apps/sol/maint/001_migrate_agent_run_logs.py:92` | legacy agent run-log file | one-shot run-log migration | delete follows successful migration of that log file | no | no | `⚠️` | shipped maint migration; one-shot historical cleanup |
| `apps/sol/maint/002_migrate_chronicle.py:77,91` | legacy chronicle day dir and legacy SQLite db | one-shot chronicle migration | delete follows successful day/db migration | no | no | `⚠️` | shipped maint migration; one-shot historical cleanup |

## Deferred Follow-ups

- `apps/entities/call.py:179`: audited write-owner move path; defer to a broader entities deletion parity pass.
- `think/facets.py:907`: audited write-owner delete path; not a named gap for this sweep.
- `think/entities/journal.py:369,375`: production route coverage exists, but helper-local parity remains deferred.
- `think/entities/merge.py:520,536,697,702`: audited, commit-gated merge workflow; too broad for this lode.
- `think/tools/call.py:402`: audited facet-merge flow; broader merge semantics make it a defer.
- No `❌` rows remain after B1 and B2 in this sweep.
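
For readers weighing the deferred rows against the reference model, the traits listed under Methodology (path validation, per-file stream-hashed SHA-256, dry-run support, an audit log written before the delete) can be combined in a rough sketch like the one below. It is an illustration only and does not reproduce `think/retention.py` or `think/importers/shared.py`; `sha256_file`, `remove_tree_audited`, the JSON-lines log format, and the parent-based containment check are all assumptions made for this example.

```python
import hashlib
import json
import shutil
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files are never read whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def remove_tree_audited(
    root: Path, target: Path, log_path: Path, dry_run: bool = True
) -> list[dict]:
    """Hash every file under `target`, log the manifest, then delete the tree.

    Hypothetical helper: names and log format are illustrative assumptions.
    """
    root, target = root.resolve(), target.resolve()
    # Path validation: refuse anything that is not strictly inside root.
    if target == root or root not in target.parents:
        raise ValueError(f"{target} is not inside {root}")
    manifest = [
        {
            "path": str(p.relative_to(root)),
            "sha256": sha256_file(p),
            "bytes": p.stat().st_size,
        }
        for p in sorted(target.rglob("*"))
        if p.is_file()
    ]
    if dry_run:
        # Dry run: report what would be removed and touch nothing.
        return manifest
    # Audit first, delete second, so the log survives a failed rmtree.
    with log_path.open("a", encoding="utf-8") as log:
        entry = {"removed": str(target.relative_to(root)), "files": manifest}
        log.write(json.dumps(entry) + "\n")
    shutil.rmtree(target)
    return manifest
```

Defaulting `dry_run` to `True` is a design choice in this sketch: a caller has to opt in to the destructive path explicitly, which mirrors how the `commit=False` merge flow in the table behaves.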