declarative relay deployment on hetzner relay-eval.waow.tech
atproto relay

fix indigo relay coverage gap: bump account limits, fix cronjob auth, raise default limit

- bumped per-host account limits for 16 over-limit hosts via changeLimits API
- fixed reconnect cronjob phase 2: /admin/pds/list fetch was missing auth headers
- added RELAY_DEFAULT_ACCOUNT_LIMIT=10000 env var to prevent recurrence
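
A minimal sketch of how one such limit bump could be issued from Python. The `/admin/pds/changeLimits` path and the use of Basic auth come from this commit; the JSON field names (`host`, `repo_limit`), the `admin` username, the password, and the `RELAY_URL` value are assumptions for illustration and should be checked against the relay's admin handler before use. The request is only built here, not sent.

```python
import base64
import json
import urllib.request

RELAY_URL = "https://relay.example.com"  # placeholder, not the real deployment URL


def build_change_limits_request(host, limit, password, user="admin"):
    """Build (but do not send) an authenticated POST to the changeLimits endpoint."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    # Field names are hypothetical; confirm the real schema in the relay source.
    body = json.dumps({"host": host, "repo_limit": limit}).encode()
    return urllib.request.Request(
        f"{RELAY_URL}/admin/pds/changeLimits",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


req = build_change_limits_request("atproto.brid.gy", 50_000, password="example-password")
# Sending it would be: urllib.request.urlopen(req, timeout=30)
```

Building the request separately from sending it makes it easy to inspect the headers first, which is the class of mistake the cronjob fix in this commit addresses.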

ref: https://bsky.app/profile/bnewbold.net/post/3mgtbaicg322f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

zzstoatzz 49ab727b eaa81c3b

+57 -1
+51
docs/ops-changelog.md
···
 
 ---
 
+## 2026-03-11
+
+### fix: indigo relay coverage gap (98.44% → targeting 99.9%)
+
+**problem**: Pulsar showed relay.waow.tech missing ~3,620 users. primary cause:
+default per-host account limit of 100. 21 non-bsky hosts exceeded this — every
+account beyond 100 was permanently marked `host-throttled` and all events dropped.
+biggest offenders: atproto.brid.gy (12,911 accounts, 12,811 throttled),
+eurosky.social (5,328), blacksky.app (4,889). total: ~25K+ throttled accounts.
+
+secondary cause: reconnect cronjob phase 2 fetched `/admin/pds/list` without auth
+headers → 401 → fell back to empty set → bsky.network host diff was broken.
+
+**note from bryan (@bnewbold.net)**: the account limit also drives an "events per
+hour" quota — so even hosts under the account cap can get rate-limited during
+active periods. his recommendation: give high-traffic PDS hosts very high limits
+(e.g. 100K) to avoid both account throttling and event rate throttling.
+
+**fix (account limits)**: used `POST /admin/pds/changeLimits` for 16 hosts
+(atproto.brid.gy:50K, eurosky:100K, blacksky:20K, bsky.aenead.net:10K, 12 others:5K).
+endpoint auto-upgrades host-throttled accounts to active.
+
+**fix (cronjob auth)**: added `Request` object with `HEADERS` (which already
+contains Basic auth) to the `/admin/pds/list` fetch in reconnect-cronjob.yaml.
+
+**fix (default limit)**: added `RELAY_DEFAULT_ACCOUNT_LIMIT: "10000"` env var to
+relay-values.yaml so new hosts start with a higher limit.
+
+### gotcha: bjw-s app-template `args` vs image CMD
+
+**mistake**: tried to pass `--default-account-limit=10000` via the helm chart's
+`args` field. this maps to k8s container `args`, which REPLACES the image's `CMD`.
+the relay image uses `dumb-init` as `ENTRYPOINT` with the relay binary as `CMD` —
+so `args: ["--default-account-limit=10000"]` caused `dumb-init` to try executing
+the flag string as a binary. CrashLoopBackOff.
+
+**compounding factor**: `helm upgrade --wait` timed out, marking the release as
+failed. helm's failed-release rollback didn't actually revert the k8s deployment
+spec. removing `args` from values and re-upgrading didn't clear it either — helm
+uses strategic merge patching which can't remove fields that were never in the
+chart's default values. had to `kubectl patch --type=json` with an explicit remove
+op to clear the stale `args`.
+
+**lesson**: for bjw-s/app-template, NEVER use `args` to pass flags to the
+application — it replaces the image CMD. use env vars instead (indigo relay
+supports `RELAY_DEFAULT_ACCOUNT_LIMIT` / `RELAY_DEFAULT_REPO_LIMIT`). if you
+do accidentally set `args`, you may need a json patch to remove it since helm
+strategic merge won't clear it.
+
+---
+
 ## 2026-03-09
 
 ### fix: collection index missing `update` ops (0adf187, deployed)
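
The `args` gotcha in the changelog entry can be illustrated with a small model of how Kubernetes combines the image's `ENTRYPOINT`/`CMD` with the pod's `command`/`args`. This is a sketch, not the relay's actual Dockerfile: the `ENTRYPOINT` and `CMD` values below are assumed for illustration, and the container index `0` in the JSON Patch path is likewise an assumption about the deployment layout.

```python
import json


def resolve_argv(image_entrypoint, image_cmd, command=None, args=None):
    """Resolve the container's argv following the Kubernetes command/args rules."""
    if command is not None:
        # `command` overrides ENTRYPOINT, and the image CMD is ignored.
        return command + (args or [])
    # Without `command`, `args` replaces CMD; otherwise the image CMD is used.
    return image_entrypoint + (args if args is not None else image_cmd)


ENTRYPOINT = ["dumb-init", "--"]  # assumed image ENTRYPOINT
CMD = ["/relay", "serve"]         # assumed image CMD

healthy = resolve_argv(ENTRYPOINT, CMD)
# Setting only `args` drops the relay binary: dumb-init is handed the flag
# in the position where it expects a program to execute.
broken = resolve_argv(ENTRYPOINT, CMD, args=["--default-account-limit=10000"])

# JSON Patch (RFC 6902) document with an explicit remove op for the stale field.
patch = json.dumps(
    [{"op": "remove", "path": "/spec/template/spec/containers/0/args"}]
)
```

The `patch` string is the kind of payload that `kubectl patch deployment <name> --type=json -p '<patch>'` accepts; the deployment name is left as a placeholder because the source does not give it.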
+5 -1
indigo/deploy/reconnect-cronjob.yaml
···
 print("\nphase 2: pulling hosts from bsky.network...")
 our_hosts = set()
 try:
-    resp = urllib.request.urlopen(f"{RELAY_URL}/admin/pds/list", timeout=30)
+    req = urllib.request.Request(
+        f"{RELAY_URL}/admin/pds/list",
+        headers=HEADERS,
+    )
+    resp = urllib.request.urlopen(req, timeout=30)
     for rec in json.loads(resp.read()):
         our_hosts.add(rec["Host"])
 except Exception as e:
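
A self-contained sketch of the pattern this hunk introduces: wrap the URL in a `urllib.request.Request` carrying the auth headers, then parse the `Host` field of each record. The `fetch_host_set` helper, the `fake_urlopen` stand-in, the example URL, and the header value are all hypothetical and exist only so the example runs without a live relay.

```python
import io
import json
import urllib.request


def fetch_host_set(url, headers, urlopen=urllib.request.urlopen):
    """Return the set of hosts from a PDS list endpoint, sending auth headers."""
    req = urllib.request.Request(url, headers=headers)
    with urlopen(req, timeout=30) as resp:
        return {rec["Host"] for rec in json.load(resp)}


def fake_urlopen(req, timeout):
    """Stand-in for the relay: rejects requests that carry no Authorization header."""
    if not req.has_header("Authorization"):
        raise PermissionError("401 Unauthorized")
    return io.BytesIO(json.dumps([{"Host": "pds.example.com"}]).encode())


HEADERS = {"Authorization": "Basic placeholder"}  # hypothetical value
LIST_URL = "https://relay.example.com/admin/pds/list"  # placeholder URL

hosts = fetch_host_set(LIST_URL, HEADERS, fake_urlopen)
```

Calling `fetch_host_set(LIST_URL, {}, fake_urlopen)` raises instead of returning an empty set, which is one way to keep an auth failure from silently turning into an empty host list as it did before this fix.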
+1
indigo/deploy/relay-values.yaml
···
   LOG_LEVEL: "info"
   GOMEMLIMIT: "7GiB"
   GOMAXPROCS: "4"
+  RELAY_DEFAULT_ACCOUNT_LIMIT: "10000"
 envFrom:
   - secretRef:
       name: relay-secret