# Convey error handling - Wave 3 design

## Overview

Wave 3 closes the remaining Tier 3 shell, onboarding, and background-task error-handling gaps across 10 sites: `init.html` observers/finalize, shell background registration, chat-bar hydrate, month stats shell, todos background badge/nudges, support background verification, support ticket detail, and support announcements.

This wave is mostly shell-template work. Reuse the Wave 0 primitives, add one tiny `AppServices` helper for non-task background failures, and keep the support badge backend contract explicitly out of scope.

## Conventions carried forward

- Keep `window.logError(err, { context: ... })` alongside every owner-visible failure surface. Logging and UI are separate requirements.
- Use `err.serverMessage` as the primary owner copy when present; fall back to the explicit strings below.
- Do not add retry buttons. Reload is the recovery path unless the site already has its own action button.
- Reuse Wave 0 primitives in place: `window.apiJson`, `window.SurfaceState.errorCard`, `window.SurfaceState.replaceLoading`, and `AppServices.registerTask`.
- UI timeouts and background-task failure thresholds are owner-patience signals, not backend SLAs.
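
The first two conventions can be illustrated with a minimal sketch. `ownerCopy` and `reportFailure` are hypothetical names that do not exist in the codebase; `log` stands in for `window.logError` and `show` for whatever writes into a site's error slot, so the sketch runs outside the browser.

```javascript
// Hypothetical helper (not in the codebase): picks the owner-facing copy.
// Prefers a non-empty err.serverMessage, otherwise the site's fallback string.
function ownerCopy(err, fallback) {
  const serverMessage =
    err && typeof err.serverMessage === 'string' ? err.serverMessage.trim() : '';
  return serverMessage || fallback;
}

// Hypothetical wrapper: log first, then surface. Both steps always run,
// because logging and UI are separate requirements.
function reportFailure(err, { context, fallback, log, show }) {
  log(err, { context });
  show(ownerCopy(err, fallback));
}

// Usage with in-memory stand-ins for window.logError and the error slot.
const logged = [];
const shown = [];
reportFailure(new Error('network down'), {
  context: 'init-observers',
  fallback: "Couldn't check for observers - reload to try again.",
  log: (err, meta) => logged.push({ message: err.message, context: meta.context }),
  show: (text) => shown.push(text),
});
```

In the templates the same ordering would presumably be written inline in each `catch`, since the design does not call for a new shared helper here.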

## Decision log

| Decision | Chosen | Rationale | Rejected alternatives | Final copy |
| --- | --- | --- | --- | --- |
| D-init | Shipped 2026-04-23. Add `<script src="error-handler.js">` and `<script src="api.js">` to `convey/templates/init.html`; do not load `app.js` or `websocket.js`; add local `.error-message` styling in `init.html` because onboarding does not load `app.css`. Verified: `convey/static/api.js` is standalone and `convey/static/error-handler.js` is safe without shell DOM or `appEvents`. | Smallest DRY path. Avoids micro-inlining shared helpers into onboarding. | Inline `apiJson`/`logError`; add `app.js`; add `websocket.js`; extract a shared partial just for two scripts. | None. |
| D-finalize | Shipped 2026-04-23. Move `finalize()` to `window.apiJson('/init/finalize', ...)`; add `#finalize-error` next to the CTA; log with context `init-finalize`. Keep the password field-local `showFieldStatus(...)` path for the current backend 400 password validation case and do not double-surface that error in the new slot. | `apiJson` fixes non-2xx and malformed JSON; the new slot fixes the current "button does nothing" failure. Current backend only returns 400 for password length, so field-local handling remains precise. | Password-only surfacing; silent catch; redirect on failure; modal/global error. | Password field on 400: current server string `Password must be at least 8 characters` via `err.serverMessage`. Finalize slot fallback: `Couldn't finalize setup. Check your connection and try again.` |
| D-observers | Shipped 2026-04-23. Add `#observer-error` as a separate node from `#observer-empty`; switch `loadObservers()` to `window.apiJson('/init/observers')`; hide `#observer-error` on success and hide `#observer-empty` while an error is showing; log with context `init-observers`. | Empty state and error state must stop sharing the same element. `apiJson` gives one transport contract. | Reuse `#observer-empty`; inline `fetch(...).json()`; silent keep-empty behavior. | Primary: `err.serverMessage`. Fallback: `Couldn't check for observers - reload to try again.` |
| D-bg-register | Shipped 2026-04-23. Add `AppServices.markBackgroundFailing(appName, error)` beside `registerTask()` / `getTaskHealth()` in `convey/static/app.js`. `convey/templates/app.html` background catch calls that helper plus `window.logError(err, { context: 'app-bg-register', app: '{{ app_name }}' })`. Helper only adds `.menu-item-bg-failing` to `.menu-item[data-app-name="<app>"]` and no-ops if missing. | Centralizes shell-side failing-pip behavior for background scripts that fail before they can register a task. Keeps the app loop running. | Direct DOM mutation inside template catch; fake `registerTask()` records; popup notifications for registration errors. | None - pip only. |
| D-chat-hydrate | Shipped 2026-04-23. On hydrate failure, call `setPendingState(true)`, `setStatus("Couldn't load recent chat session. Reload to try again.", "Couldn't load recent chat session. Reload to try again.")`, and `window.logError(err, { context: 'chat-hydrate' })`. Verified: `setStatus()` only accepts `(text, title)` and has no error variant. Do not add a new chat-bar state. | Reuses the existing disabled affordance on `#chatBarInput` / `#chatBarSend` and keeps the change local to `convey/templates/app.html`. | Invent a chat-bar error variant/class; leave the bar interactive; blank status on error. | `Couldn't load recent chat session. Reload to try again.` |
| D-month-shell | Shipped 2026-04-23. Use a normalized provider contract: `date_nav.html` providers return `{ data, error }`, with `window.apiJson(...)` inside the provider. `convey/static/month-picker.js` caches `{ data, error, facet }`, preserves stale `data` on same-month/same-facet refetch failure, logs with context `month-stats`, adds a warning glyph on `#date-nav-label` via CSS state when `error` is present, and renders a small inline dropdown error above the grid only when the current month has no cached data for the active facet. Clear glyph/title/inline error on next success. | Current picker has no internal header to tint and must remain non-blocking. A normalized `{ data, error }` contract keeps the provider and picker responsibilities explicit. | Block with `SurfaceState.errorCard`; add a new header row inside the picker; keep collapsing failures to empty months. | Label tooltip: `err.serverMessage || "Couldn't load month stats."` Inline first-open copy: `Couldn't load month stats.` plus `err.serverMessage` as secondary text. |
| D-todos-bg | Shipped 2026-04-23. Migrate `updateBadge()` to `AppServices.registerTask('todos', 'update-badge', { intervalMs: 5 * 60 * 1000, run, onSuccess })`. Migrate `checkNudges()` to `AppServices.registerTask('todos', 'check-nudges', { run })` with no `intervalMs`; verified `registerTask()` still performs the initial run when `intervalMs` is absent. Both task `run` functions use the task-scoped `apiJson`; validate `count` and `nudges` shape before mutating badge state or scheduling timers. Keep `_nudgeTimers` dedupe exactly as-is. | Matches support for badge cadence, fixes init-only dark failures, and avoids adding nudge re-fetch dedupe state. The current `registerTask()` contract already supports init-only work cleanly. | Poll nudges; fake init-only with a 24h interval; keep raw `fetch(...)`; add new background-task framework. | No new site-specific copy. Shared `registerTask()` failure notification remains `todos background task failing` with the thrown message. |
| D-support-bg | Shipped 2026-04-23. No client change. Keep `apps/support/background.html` as-is and record that Wave 0 already migrated it to `registerTask(intervalMs=5m)`. Flag the backend contract gap instead. | The client already uses the intended primitive. The missing behavior is server-side: the badge route never returns failure or 403 to the task. | Client-side heuristics on `count === 0`; server fix in this wave. | None. |
| D-open-ticket | Shipped 2026-04-23. Move `openTicket()` to `window.apiJson('/app/support/api/tickets/' + id)`; log with context `support-open-ticket`; on failure render a `support-empty` card into `#ticket-detail` that matches `loadTickets()`'s support-local error shape, but keep the existing back affordance/button. | `loadTickets()` is already the gold-standard pattern in this workspace. Reuse the support-local card instead of mixing in a second visual language. | Keep raw `fetch()` and sparse-ticket rendering; switch to a generic `SurfaceState` card; hide the detail pane on error. | Heading fallback: `Couldn't load ticket.` Hint: `Go back and select it again.` Button: `Back to tickets`. |
| D-announcements | Shipped 2026-04-23. Add local `announcementsFirstPaintDone` and `announcementsLastSuccessAt` state in `apps/support/workspace.html`. Move `loadAnnouncements()` to `window.apiJson(...)`; on first-paint failure render inline error copy in the banner slot and log with context `support-announcements`; on later failure after a successful load, preserve the prior banner content and append a singleton stale-indicator sibling with timestamp text. On success clear stale UI and set the flags. Note: current HEAD only calls `loadAnnouncements()` once, so the stale branch is forward-compatible and does not add a new refresh trigger. | Stops `!ok` from disappearing silently and preserves stale content if the function is ever re-run. The extra flags are the minimum state needed for the split surface. | Silent `!ok` return; always replace banner on failure; add a retry button or periodic polling. | First paint: `Couldn't load announcements. Reload to try again.` Refresh stale: `Couldn't refresh announcements - showing last known state.` Timestamp suffix: `Last updated {local time}.` |

## Implementation order

### Bundle 1

Target commit message: `convey: load api helpers in init shell`

| File | Change |
| --- | --- |
| `convey/templates/init.html` | Add `error-handler.js` and `api.js` includes; add local `.error-message` styles; add `#observer-error` and `#finalize-error` slots. |

### Bundle 2

Target commit message: `convey: surface init observer and finalize failures`

| File | Change |
| --- | --- |
| `convey/templates/init.html` | Migrate `loadObservers()` and `finalize()` to `window.apiJson`; wire the new slots; keep password validation field-local; add `window.logError` calls. |
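
A rough sketch of the slot wiring in this bundle, under stated assumptions: the `observers` array in the result, the `{ error }` wrapper, and the `err.status` property are all hypothetical, since this document does not specify the `/init/observers` payload or the shape of `apiJson` errors. Plain objects stand in for `#observer-error` and `#observer-empty` so the sketch runs without a DOM.

```javascript
const OBSERVER_FALLBACK = "Couldn't check for observers - reload to try again.";

// `slots` holds stand-ins for #observer-error and #observer-empty. Only the
// `hidden` and `textContent` properties are used, so real elements also work.
// The `observers` array in `result` is an assumed response shape.
function renderObserverState(slots, result) {
  if (result.error) {
    slots.error.textContent = result.error.serverMessage || OBSERVER_FALLBACK;
    slots.error.hidden = false;
    slots.empty.hidden = true; // never show "empty" and "error" together
    return;
  }
  slots.error.textContent = '';
  slots.error.hidden = true;
  slots.empty.hidden = result.observers.length > 0;
}

// Assumed: apiJson errors expose the HTTP status as `err.status`.
// A 400 stays on the password field; everything else uses #finalize-error.
function finalizeErrorTarget(err) {
  return err && err.status === 400 ? 'password-field' : 'finalize-slot';
}

// Usage: a failed observers load shows the error slot and hides the empty slot.
const slots = {
  error: { hidden: true, textContent: '' },
  empty: { hidden: true, textContent: '' },
};
renderObserverState(slots, { error: new Error('HTTP 502') });
```

Routing the 400 case away from `#finalize-error` is what keeps the password error from being surfaced twice.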

### Bundle 3

Target commit message: `convey: mark shell background and chat hydrate failures`

| File | Change |
| --- | --- |
| `convey/static/app.js` | Add `AppServices.markBackgroundFailing(appName, error)` beside `registerTask()` / `getTaskHealth()`. |
| `convey/templates/app.html` | Use the new helper in the per-app background catch; migrate chat hydrate failure to disabled input + status copy + `logError`. |
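
The helper's documented behavior (add one class to the matching menu item, no-op if the item is missing) might look roughly like the sketch below. It is not the shipped implementation: the `root` parameter, the boolean return value, and the `fakeRoot` stand-in for `document` are assumptions added so the sketch can run and be checked outside the browser.

```javascript
// Sketch of the helper's documented behavior. `root` is an added parameter
// (an assumption) so the sketch can run against a stand-in for `document`.
// `error` is accepted to match the documented signature but is unused here.
// App names are assumed to be simple identifiers that need no CSS escaping.
function markBackgroundFailing(appName, error, root) {
  const selector = '.menu-item[data-app-name="' + appName + '"]';
  const item = root.querySelector(selector);
  if (!item) return false; // no menu item for this app: no-op
  item.classList.add('menu-item-bg-failing');
  return true;
}

// Minimal stand-in for `document`, keyed by the exact selector string.
function fakeRoot(appNames) {
  const items = new Map();
  for (const name of appNames) {
    const classes = new Set();
    items.set('.menu-item[data-app-name="' + name + '"]', {
      classes,
      classList: { add: (cls) => classes.add(cls) },
    });
  }
  return { items, querySelector: (sel) => items.get(sel) || null };
}

// Usage: the known app gets the failing class, the unknown app is ignored.
const root = fakeRoot(['todos', 'support']);
markBackgroundFailing('todos', new Error('background script threw'), root);
markBackgroundFailing('missing-app', new Error('ignored'), root);
```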

### Bundle 4

Target commit message: `convey: signal month stats failures in date nav`

| File | Change |
| --- | --- |
| `convey/templates/date_nav.html` | Change provider to `window.apiJson(...)` and return `{ data, error }`. |
| `convey/static/month-picker.js` | Normalize/cache `{ data, error, facet }`; preserve stale data on failure; render inline picker error and warning state; log failures. |
| `convey/static/app.css` | Add label warning-glyph state and small picker-error styles. |
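
The stale-data rule in this bundle can be sketched as a small cache-merge function. The function names, the `'YYYY-MM'` cache key, the facet names, and the per-day `data` shape are all hypothetical; only the `{ data, error, facet }` entry shape and the two surfacing rules come from the decision log.

```javascript
// Stores a provider result ({ data, error }) in the cache as { data, error, facet }.
// On failure, keeps the previous data only if it was cached for the same facet.
// The 'YYYY-MM' month key is an assumption about how the picker keys its cache.
function storeMonthResult(cache, monthKey, facet, result) {
  const previous = cache.get(monthKey);
  const sameFacet = previous !== undefined && previous.facet === facet;
  const entry = result.error
    ? { data: sameFacet ? previous.data : null, error: result.error, facet }
    : { data: result.data, error: null, facet };
  cache.set(monthKey, entry);
  return entry;
}

// Label glyph: shown whenever the entry has an error.
// Inline picker error: shown only when there is no cached data to fall back on.
function monthErrorSurfaces(entry) {
  const hasError = entry.error !== null;
  return {
    labelGlyph: hasError,
    inlineError: hasError && entry.data === null,
  };
}

// Usage: a refetch failure keeps April's data; a first-open failure for May has none.
const cache = new Map();
storeMonthResult(cache, '2026-04', 'work', { data: { '2026-04-23': 3 }, error: null });
const stale = storeMonthResult(cache, '2026-04', 'work', { data: null, error: new Error('HTTP 500') });
const firstOpen = storeMonthResult(cache, '2026-05', 'work', { data: null, error: new Error('HTTP 500') });
```

Because a successful result always writes `error: null`, the next success clears both surfaces without extra bookkeeping.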

### Bundle 5

Target commit message: `todos: migrate badge and nudge background fetches`

| File | Change |
| --- | --- |
| `apps/todos/background.html` | Wrap badge polling and nudge scheduling in `AppServices.registerTask`; keep badge polling at 5 minutes; keep nudges init-only; add light shape validation. |
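
The "light shape validation" could be as small as the two guards sketched below. The function names, the error messages, and the exact rules (a non-negative integer `count`, an array `nudges`) are assumptions; the source only says that both shapes are validated before badge state or timers are touched. Throwing is used here so that a task's `run` would fail and the shared `registerTask()` failure path would report the thrown message.

```javascript
// Hypothetical guard for the badge payload. Throws on an unexpected shape so
// the caller never mutates badge state with a bad value.
function readBadgeCount(payload) {
  const count = payload && payload.count;
  if (!Number.isInteger(count) || count < 0) {
    throw new Error('todos badge: expected a non-negative integer count');
  }
  return count;
}

// Hypothetical guard for the nudges payload. Throws before any timers are scheduled.
function readNudges(payload) {
  const nudges = payload && payload.nudges;
  if (!Array.isArray(nudges)) {
    throw new Error('todos nudges: expected a nudges array');
  }
  return nudges;
}

// Usage: a valid payload passes through, a malformed one is rejected.
const count = readBadgeCount({ count: 4 });
let rejected = null;
try {
  readNudges({ nudges: 'none' });
} catch (err) {
  rejected = err.message;
}
```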

### Bundle 6

Target commit message: `support: harden ticket detail loads`

| File | Change |
| --- | --- |
| `apps/support/workspace.html` | Move `openTicket()` to `window.apiJson`; render support-local error card with back affordance; add `window.logError`. |
| `apps/support/background.html` | Verify only - no edit expected. |
| `apps/support/routes.py` | Reference only - follow-up, no edit in Wave 3. |

### Bundle 7

Target commit message: `support: split announcements first-paint and stale errors`

| File | Change |
| --- | --- |
| `apps/support/workspace.html` | Add `announcementsFirstPaintDone` / `announcementsLastSuccessAt`; move `loadAnnouncements()` to `window.apiJson`; split first-paint vs stale refresh surfaces; add `window.logError`. |
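
The first-paint versus stale split reduces to a small state transition, sketched below without any DOM writes. The two flag names and the copy strings come from the decision log; the function names, the `{ error }` result wrapper, the returned descriptor object, and the use of `toLocaleTimeString()` for `{local time}` are assumptions.

```javascript
const FIRST_PAINT_COPY = "Couldn't load announcements. Reload to try again.";
const STALE_COPY = "Couldn't refresh announcements - showing last known state.";

function createAnnouncementsState() {
  return { announcementsFirstPaintDone: false, announcementsLastSuccessAt: null };
}

// Returns a description of what to render; the caller would do the DOM work.
function applyAnnouncementsResult(state, result, now) {
  if (!result.error) {
    // Success: set both flags and clear any stale indicator.
    state.announcementsFirstPaintDone = true;
    state.announcementsLastSuccessAt = now;
    return { surface: 'banner', stale: false, text: null };
  }
  if (state.announcementsFirstPaintDone) {
    // Later failure: keep the prior banner and describe the stale indicator.
    const when = state.announcementsLastSuccessAt.toLocaleTimeString();
    return { surface: 'banner', stale: true, text: STALE_COPY + ' Last updated ' + when + '.' };
  }
  // First-paint failure: inline error copy in the banner slot.
  return {
    surface: 'inline-error',
    stale: false,
    text: result.error.serverMessage || FIRST_PAINT_COPY,
  };
}

// Usage: fail, succeed, then fail again to reach the stale branch.
const state = createAnnouncementsState();
const first = applyAnnouncementsResult(state, { error: new Error('HTTP 500') }, new Date());
applyAnnouncementsResult(state, { error: null }, new Date());
const later = applyAnnouncementsResult(state, { error: new Error('HTTP 500') }, new Date());
```

Since current HEAD only calls `loadAnnouncements()` once, the stale branch in this sketch would only be reached if a second call is added later.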

## Validation plan

- Run `make ci`.
- Run `make test`.
- Grep sweeps after implementation:
  - confirm the migrated sites no longer use raw `fetch(...).json()` at `init observers/finalize`, chat hydrate, ticket detail, announcements, and todos background.
  - confirm `AppServices.markBackgroundFailing` is the only template-side path for non-task background registration failure.
  - confirm month-picker code now references the normalized `{ data, error }` contract and the warning-state selector.
- Screenshot plan:
  - capture `/init` with forced observers failure and forced finalize failure to verify the new inline slots do not collide with `#observer-empty` or the password field.
  - capture one date-nav app with month-picker first-open failure and one stale-month failure to verify the label glyph plus inline picker error.
  - capture `/app/support` with ticket-detail failure and with announcements first-paint failure / stale indicator.
- Manual smoke:
  - open `convey/static/tests/register-task.html` to confirm the shared background-task behavior still passes after adding `markBackgroundFailing`.

## Out-of-scope follow-ups

1. `apps/support/routes.py:256-272` collapses errors to `200 {"count": 0}`. Server contract change needed so the support `registerTask` sees real failures / 403-disabled.
2. `init.html` gains `logError` calls but onboarding has no shell-level `#error-log` sink. Inline slots are the owner-visible surface; logging is telemetry-only at init.
3. Wave 2 D9 - `apps/sol/workspace.html:1371` `sol updated-days` deferred because the server collapses failure to `[]`.

### Also noted during audit

- `apps/activities/_day.html:953` - Wave 1 activity loader missing `window.logError`.
- `apps/settings/workspace.html:3564` - pre-existing undocumented empty catch after `saveControl`; `onError` at `:3562` logs, but the suppressing catch is undocumented.

## Audit evidence

- Grep sweeps passed after audit fixup: migrated Wave 3 sites no longer have raw `fetch(...).json()` at the migrated lines; recursive empty-catch sweep is empty; no `console.error` remains in the touched Wave 3 areas; all expected `logError` contexts are present (`init-observers`, `init-finalize`, `app-bg-register`, `chat-hydrate`, `month-stats`, `support-open-ticket`, `support-announcements`).
- `make test`: `3915 passed, 5 skipped, 1 warning`.
- `make test-app APP=todos`: `85 passed`; `apps/support/tests/` absent.
- `make verify-browser`: not run because `tests/verify_browser.py` does not cover the Wave 3 failure paths.
- Screenshots: not captured; direct `sol screenshot` required a running stack, and the sandbox reported ready but exited before screenshots could connect.