my prefect server setup prefect-metrics.waow.tech
python orchestration

fix DuckDB lock contention from grafana datasource plugin

the motherduck DuckDB plugin was holding an exclusive write lock on the
live analytics.duckdb, blocking all flow writes. add a busybox sidecar
to the grafana pod that maintains a snapshot copy every 60s — grafana
reads the copy, flows own the live DB.

also stagger tangled-items to :02 (was :00 with gh-notifications) to
avoid the two ingestion flows contending on DuckDB writes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

zzstoatzz 812bdb62 30b65620

+57 -11
+53 -7
deploy/monitoring-values.yaml
···
       allow_loading_unsigned_plugins: motherduck-duckdb-datasource

   # plugin not in Grafana registry — pre-installed on node via `just install-grafana-plugin`
-  # mounted as read-only hostPath into grafana's plugin directory
+  # grafana reads a snapshot copy of the DuckDB — never the live file.
+  # a busybox sidecar refreshes the snapshot every 60s so the DuckDB plugin
+  # can't take an exclusive lock on the production database.
+
+  extraContainerVolumeMounts:
+    - name: duckdb-plugin
+      mountPath: /var/lib/grafana/plugins/motherduck-duckdb-datasource
+      readOnly: true
+    - name: analytics-snapshot
+      mountPath: /analytics
+      readOnly: true
+
+  extraInitContainers:
+    # seed the snapshot before grafana starts so the datasource has data on first boot
+    - name: snapshot-seed
+      image: busybox:1
+      command: ["sh", "-c"]
+      args:
+        - |
+          src=/analytics-src/analytics.duckdb
+          dst=/analytics-snap/analytics.duckdb
+          until [ -f "$src" ]; do echo "waiting for $src"; sleep 5; done
+          cp "$src" "$dst"
+      volumeMounts:
+        - name: analytics-duckdb
+          mountPath: /analytics-src
+          readOnly: true
+        - name: analytics-snapshot
+          mountPath: /analytics-snap
+
+  extraContainers:
+    # refresh snapshot every 60s — grafana reads the copy, never the live DB
+    - name: snapshot-sync
+      image: busybox:1
+      command: ["sh", "-c"]
+      args:
+        - |
+          src=/analytics-src/analytics.duckdb
+          dst=/analytics-snap/analytics.duckdb
+          while true; do
+            [ -f "$src" ] && cp "$src" "$dst"
+            sleep 60
+          done
+      volumeMounts:
+        - name: analytics-duckdb
+          mountPath: /analytics-src
+          readOnly: true
+        - name: analytics-snapshot
+          mountPath: /analytics-snap
+
   extraVolumes:
     - name: duckdb-plugin
       hostPath:
···
       hostPath:
         path: /var/lib/prefect-analytics
         type: DirectoryOrCreate
+    - name: analytics-snapshot
+      emptyDir: {}

-  extraVolumeMounts:
-    - name: duckdb-plugin
-      mountPath: /var/lib/grafana/plugins/motherduck-duckdb-datasource
-      readOnly: true
-    - name: analytics-duckdb
-      mountPath: /analytics
+  extraVolumeMounts: []

   additionalDataSources:
     - name: DuckDB
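The `snapshot-sync` sidecar in the diff above refreshes the copy with a plain `cp "$src" "$dst"`, which rewrites the destination in place, so a reader that opens the snapshot mid-copy could in principle see a partially written file. One possible hardening, sketched here in Python rather than busybox `sh`, is to copy to a temporary file in the same directory and then rename it over the snapshot. This is an illustration only, not what the commit does: `refresh_snapshot` is a hypothetical helper, and the sketch assumes source and snapshot are ordinary files on a POSIX filesystem. It does not address whether a file-level copy of a database that is being written at that moment is internally consistent.

```python
import os
import shutil
import tempfile
from pathlib import Path


def refresh_snapshot(src: Path, dst: Path) -> bool:
    """Copy src to dst through a temp file plus an atomic rename.

    Hypothetical helper. Returns False and leaves dst untouched
    when src does not exist yet.
    """
    if not src.is_file():
        return False
    # Create the temp file next to dst so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=dst.parent, prefix=dst.name + ".", suffix=".tmp"
    )
    os.close(fd)
    try:
        # shutil.copy copies the contents and the permission bits of src.
        shutil.copy(src, tmp_name)
        # os.replace swaps the file in one step on POSIX, so a reader
        # opens either the old snapshot or the new one.
        os.replace(tmp_name, dst)
    except BaseException:
        # Remove the temp file if the copy or the rename failed.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return True
```

Calling `refresh_snapshot(Path("/analytics-src/analytics.duckdb"), Path("/analytics-snap/analytics.duckdb"))` in a loop with a 60-second sleep would mirror the sidecar's cadence. The same copy-then-rename idea can also be expressed in shell with `cp` to a temporary name followed by `mv`.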
+3 -3
docs/hub.md
···

 ## data sources

-two ingestion flows run hourly at :00, writing to separate DuckDB tables:
+two ingestion flows run hourly, staggered to avoid DuckDB write contention:

 **gh-notifications** (`flows/gh_notifications.py`) — fetches github notifications (issues + PRs) and open items authored by `zzstoatzz` via the search API. each issue is cached by repo+number for 24h. persists to `raw_github_issues`.

···
                                                     └────────────────────┘

 tangled PDS ──► tangled-items ───► raw_tangled_items ────┘
-                (hourly :00)                              │
+                (hourly :02)                              │

                                                   hub_action_items
                                                   (mart, top 200)
···
 |---|---|---|
 | `diagnostics` | `*/5 * * * *` | prints system info — canary for worker health |
 | `gh-notifications` | `0 * * * *` | github notifications + authored open issues/PRs → `raw_github_issues` |
-| `tangled-items` | `0 * * * *` | tangled.org issues/PRs/comments → `raw_tangled_items` |
+| `tangled-items` | `2 * * * *` | tangled.org issues/PRs/comments → `raw_tangled_items` |
 | `enrich` | `5 * * * *` | dbt build: staging → enrichment → mart. concurrency limit 1. runs under python 3.13 (dbt-core compat) |
 | `curate` | `10 * * * *` | loads top 200 scored items, sends to claude haiku 4.5 via pydantic-ai, writes `briefing.json` |
 | `cleanup` | `0 2 * * 0` | deletes old terminal flow runs (completed, failed, cancelled, crashed) older than 30 days |
+1 -1
prefect.yaml
···
     entrypoint: flows/tangled_items.py:tangled_items
     work_pool: *k8s
     schedules:
-      - cron: "0 * * * *"  # hourly
+      - cron: "2 * * * *"  # staggered from gh-notifications to avoid DuckDB write contention

   - name: enrich
     entrypoint: flows/enrich.py:enrich
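The stagger in this commit can be checked mechanically. The sketch below is a hypothetical helper, not code from the repo: it expands the minute field of two cron expressions and reports the minutes on which both would fire. It handles only `*`, `*/N`, single numbers, and comma lists in the minute field, and it assumes the other four fields match, which holds for the hourly schedules shown in the diff.

```python
def minutes(cron: str) -> set[int]:
    """Expand the minute field of a 5-field cron expression.

    Supports `*`, `*/N`, plain numbers, and comma-separated lists only.
    """
    field = cron.split()[0]
    result: set[int] = set()
    for part in field.split(","):
        if part == "*":
            result.update(range(60))
        elif part.startswith("*/"):
            result.update(range(0, 60, int(part[2:])))
        else:
            result.add(int(part))
    return result


def shared_minutes(a: str, b: str) -> set[int]:
    """Minutes of the hour on which both schedules fire."""
    return minutes(a) & minutes(b)


# Schedules from the diff: before and after the change to tangled-items.
GH_NOTIFICATIONS = "0 * * * *"
TANGLED_ITEMS_OLD = "0 * * * *"
TANGLED_ITEMS_NEW = "2 * * * *"

print(shared_minutes(GH_NOTIFICATIONS, TANGLED_ITEMS_OLD))  # {0}
print(shared_minutes(GH_NOTIFICATIONS, TANGLED_ITEMS_NEW))  # set()
```

Disjoint start minutes only separate the moments the flows begin. The two-minute offset avoids overlapping writes only as long as `gh-notifications` finishes its DuckDB writes before :02, which is an assumption this check cannot verify.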