personal data pipeline that digests my github and tangled.org activity, scores items by importance, and generates an LLM-curated briefing. self-hosted on a single hetzner VM (k3s) running prefect OSS.
github API ──┐
             ├──► ingest ──► raw_github_issues ──┐
tangled PDS ─┘   (hourly)    raw_tangled_items ──┤
                                                 ▼
                                          transform (dbt)
                                           [on ingest ✓]
                                                 │
                                                 ▼
                                         hub_action_items
                                             (top 200)
                                                 │
                                  ┌──────────────┼──────────┐
                                  ▼              ▼          ▼
                                brief       /api/cards    hub UI
                          [on transform ✓]
                                  │
                                  ▼
                            briefing.json
see docs/hub.md for the full pipeline breakdown.
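as a rough illustration of the "score, then keep the top 200" step in the diagram: the real logic lives in the dbt models and is not shown in this readme, so the `Item` type, its `score` field, and the sample data below are all hypothetical. this is only a sketch of the top-n selection idea in plain python.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for one row feeding hub_action_items."""
    source: str   # "github" or "tangled"
    title: str
    score: float  # hypothetical importance score; the real formula is not shown here


def top_items(items, limit=200):
    """Return up to `limit` items, highest score first."""
    return heapq.nlargest(limit, items, key=lambda item: item.score)


# made-up sample data, for illustration only
items = [
    Item("github", "review requested", 0.9),
    Item("tangled", "new issue comment", 0.4),
    Item("github", "stale notification", 0.1),
]
top_two = top_items(items, limit=2)
```

`heapq.nlargest` avoids sorting the whole input when only the first few items are needed, which is the only design point this sketch is meant to show.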
deployment
prerequisites
setup
cp .env.example .env
# fill in: HCLOUD_TOKEN, POSTGRES_PASSWORD, AUTH_STRING, DOMAIN, LETSENCRYPT_EMAIL
uv sync # install workspace (mps + root)
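before deploying, it can help to confirm that none of the variables listed above were left blank. a minimal sketch, assuming `.env` uses plain `KEY=value` lines (no `export` prefix, no multi-line values); the function name `missing_env_keys` is made up for this example and is not part of the repo.

```python
# keys taken from the "fill in" comment in the setup step
REQUIRED_KEYS = (
    "HCLOUD_TOKEN",
    "POSTGRES_PASSWORD",
    "AUTH_STRING",
    "DOMAIN",
    "LETSENCRYPT_EMAIL",
)


def missing_env_keys(env_text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in dotenv-style text."""
    values = {}
    for line in env_text.splitlines():
        line = line.strip()
        # skip blanks, comments, and anything that is not KEY=value
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return [key for key in required if not values.get(key)]


# made-up sample content, for illustration only
sample = "HCLOUD_TOKEN=abc\nPOSTGRES_PASSWORD=\n# comment\nDOMAIN=example.com\n"
missing = missing_env_keys(sample)
```

against a real file this would be called as `missing_env_keys(open(".env").read())`; an empty list means every required key has a value.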
deploy
just init # terraform init
just infra # create the VM
just kubeconfig # wait for k3s, fetch kubeconfig
just deploy # cert-manager, prefect server, monitoring, dashboards
just worker # deploy the kubernetes worker
just storage # create analytics hostPath + results PVC, patch work pool
after deploy, point your DNS:
- $DOMAIN → server IP (just server-ip)
- $GRAFANA_DOMAIN → same IP (default: prefect-metrics.waow.tech)
- hub.waow.tech → same IP
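once the records have propagated, a quick way to confirm that each name points at the server is to resolve it and compare against the output of `just server-ip`. a minimal sketch using only the standard library; `check_dns` is a hypothetical helper, and `socket.gethostbyname` only looks at IPv4 (A) records.

```python
import socket


def check_dns(hostnames, expected_ip, resolve=socket.gethostbyname):
    """Map each hostname to True if it resolves to expected_ip, else False.

    `resolve` is injectable so the check can be exercised without network access.
    """
    results = {}
    for host in hostnames:
        try:
            results[host] = resolve(host) == expected_ip
        except OSError:
            # socket.gaierror (name not found, etc.) is a subclass of OSError
            results[host] = False
    return results
```

for example, `check_dns(["hub.waow.tech"], "<server ip>")` returns a dict keyed by hostname; any `False` entry means the record is missing, stale, or pointing somewhere else.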
verify
just health # curl the /api/health endpoint
just status # node + pod resource usage
just prefect work-pool ls
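right after a deploy the server can take a little while to come up, so a single health check may fail spuriously. the sketch below retries the request a few times. it assumes the `/api/health` endpoint answers a plain unauthenticated GET with status 200, which this readme does not confirm; `wait_for_health` and its parameters are invented for this example.

```python
import time
import urllib.request


def _http_status(url):
    """Fetch url and return the HTTP status code (raises on 4xx/5xx or network errors)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.status


def wait_for_health(url, attempts=10, delay=3.0, fetch=_http_status, sleep=time.sleep):
    """Poll url until it returns 200.

    Returns the 1-based attempt number that succeeded, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            if fetch(url) == 200:
                return attempt
        except OSError:
            # urllib.error.URLError and HTTPError both derive from OSError
            pass
        if attempt < attempts:
            sleep(delay)
    return None
```

a call would look like `wait_for_health("https://<your domain>/api/health")`, with the domain taken from `DOMAIN` in `.env`.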
operations
just logs # tail prefect-server logs (default)
just logs worker # tail worker logs
just prefect flow-run ls # run any prefect CLI command remotely
just dashboards # reload grafana dashboards from deploy/dashboards/
just ssh # ssh into the server
flow deployments are registered automatically on every push to main via .tangled/workflows/deploy.yml.
hub (sveltekit frontend)
just web # build + push + deploy hub.waow.tech
analytics (dbt + duckdb)
just init-analytics # first-time: dbt deps, seed, compile