personal data pipeline that digests my github and tangled.org activity, scores items by importance, and generates an LLM-curated briefing. self-hosted on a single hetzner VM (k3s) running prefect OSS.
github API ──► gh-notifications ──► raw_github_issues ──┐
               (hourly :00)                             │
                                                        ▼
                                                enrich (dbt, :05)
                                                        │
tangled PDS ──► tangled-items ───► raw_tangled_items ───┘
                (hourly :02)                            │
                                                        ▼
                                                hub_action_items
                                                    (top 200)
                                                        │
                                         ┌──────────────┼──────────┐
                                         ▼              ▼          ▼
                                   curate (:10)    /api/cards   hub UI
                                         │
                                         ▼
                                   briefing.json
see docs/hub.md for the full pipeline breakdown.
deployment
prerequisites
setup
cp .env.example .env
# fill in: HCLOUD_TOKEN, POSTGRES_PASSWORD, AUTH_STRING, DOMAIN, LETSENCRYPT_EMAIL
uv sync # install workspace (mps + root)
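a quick way to catch an unfilled .env before running terraform is a small shell check. this is a minimal sketch, not part of the repo: `missing_env_keys` is a hypothetical helper, and it assumes plain `KEY=value` lines (no `export` prefix).

```shell
# keys named in the setup step above
REQUIRED_KEYS="HCLOUD_TOKEN POSTGRES_PASSWORD AUTH_STRING DOMAIN LETSENCRYPT_EMAIL"

# missing_env_keys FILE
# prints each required key that is absent or empty in FILE, one per line.
# returns non-zero if anything is missing. (hypothetical helper)
missing_env_keys() (
  file="$1"
  status=0
  for key in $REQUIRED_KEYS; do
    # match KEY=value where value has at least one character
    # that is not a quote or whitespace (so KEY= and KEY="" count as empty)
    if ! grep -Eq "^${key}=[\"']?[^\"'[:space:]]" "$file"; then
      echo "$key"
      status=1
    fi
  done
  exit "$status"
)

# usage (not run here):
#   missing_env_keys .env || echo "fill these in before: just init"
```

the function body runs in a subshell, so its variables do not leak into the calling shell.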
deploy
just init # terraform init
just infra # create the VM
just kubeconfig # wait for k3s, fetch kubeconfig
just deploy # cert-manager, prefect server, monitoring, dashboards
just worker # deploy the kubernetes worker
just storage # create analytics hostPath + results PVC, patch work pool
after deploy, point your DNS:
- $DOMAIN → server IP (just server-ip)
- $GRAFANA_DOMAIN → same IP (default: prefect-metrics.waow.tech)
- hub.waow.tech → same IP
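cert-manager can only issue certificates once the records above resolve, so it can help to confirm them first. a rough sketch, assuming `dig` is installed and that `just server-ip` prints a bare IPv4 address; `resolve` and `check_dns` are hypothetical helpers, not recipes in this repo.

```shell
# resolve HOST
# prints the host's A record(s), one per line.
# assumes `dig` is available; swap in another resolver if it is not.
resolve() {
  dig +short "$1" A
}

# check_dns EXPECTED_IP HOST...
# prints one status line per host and returns non-zero
# if any host does not resolve to EXPECTED_IP. (hypothetical helper)
check_dns() (
  expected="$1"
  shift
  status=0
  for host in "$@"; do
    # -F fixed string, -x whole line, -q quiet
    if resolve "$host" | grep -Fxq "$expected"; then
      echo "ok       $host"
    else
      echo "MISMATCH $host (want $expected)"
      status=1
    fi
  done
  exit "$status"
)

# usage (not run here):
#   check_dns "$(just server-ip)" "$DOMAIN" "$GRAFANA_DOMAIN" hub.waow.tech
```

freshly created records can take a while to propagate, so a mismatch right after editing DNS is not necessarily an error.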
verify
just health # curl the /api/health endpoint
just status # node + pod resource usage
just prefect work-pool ls # list work pools
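right after a deploy the server may need a minute before the health check passes. the loop below is a sketch of polling until it does. it assumes the endpoint is reachable at https://$DOMAIN/api/health without credentials, which may not hold if basic auth is enforced on that path; `probe` and `wait_for_health` are hypothetical helpers.

```shell
# probe URL
# succeeds unless the request fails or returns an HTTP error (status >= 400).
probe() {
  curl -fsS --max-time 5 -o /dev/null "$1"
}

# wait_for_health URL [TRIES] [DELAY_SECONDS]
# retries probe until it succeeds or TRIES attempts have been made.
wait_for_health() (
  url="$1"
  tries="${2:-30}"
  delay="${3:-10}"
  attempt=1
  while :; do
    if probe "$url"; then
      echo "healthy after $attempt attempt(s)"
      exit 0
    fi
    # stop without sleeping once the last attempt has failed
    [ "$attempt" -ge "$tries" ] && break
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "still unhealthy after $tries attempt(s)" >&2
  exit 1
)

# usage (not run here):
#   wait_for_health "https://$DOMAIN/api/health" 30 10
```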
operations
just logs # tail prefect-server logs (default)
just logs worker # tail worker logs
just prefect flow-run ls # run any prefect CLI command remotely
just dashboards # reload grafana dashboards from deploy/dashboards/
just ssh # ssh into the server
flow deployments are registered automatically on every push to main via .tangled/workflows/deploy.yml.
hub (sveltekit frontend)
just web # build + push + deploy hub.waow.tech
analytics (dbt + duckdb)
just init-analytics # first-time: dbt deps, seed, compile