docs: add TAP memory tuning, status check, and catch-up monitoring

- document memory/parallelism settings to prevent OOM
- add `just check` quick status recipe
- explain how to monitor catch-up progress after downtime
- add OOM and catch-up issues to troubleshooting table

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

zzstoatzz d635f516 96eda720

+73
docs/tap.md
···
  debug(zat): extractAt: parse failed for Op at path { "op" }: InvalidEnumTag
  ```

+ ## memory and performance tuning
+
+ TAP loads **entire repo CARs into memory** during resync. some bsky users have repos that are 100-300MB+. this causes spiky memory usage that can OOM the machine.
+
+ ### recommended settings for leaflet-search
+
+ ```toml
+ [[vm]]
+ memory = '2gb' # 1gb is not enough
+
+ [env]
+ TAP_RESYNC_PARALLELISM = '1' # only one repo CAR in memory at a time (default: 5)
+ TAP_FIREHOSE_PARALLELISM = '5' # concurrent event processors (default: 10)
+ TAP_OUTBOX_CAPACITY = '10000' # event buffer size (default: 100000)
+ TAP_IDENT_CACHE_SIZE = '10000' # identity cache entries (default: 2000000)
+ ```
+
+ ### why these values?
+
+ - **2GB memory**: 1GB causes OOM kills when resyncing large repos
+ - **resync parallelism 1**: prevents multiple large CARs in memory simultaneously
+ - **lower firehose/outbox**: we track ~1000 repos, not millions - defaults are overkill
+ - **smaller ident cache**: we don't need 2M cached identities
+
+ if TAP keeps OOM'ing, check logs for large repo resyncs:
+ ```bash
+ fly logs -a leaflet-search-tap | grep "parsing repo CAR" | grep -E "size\":[0-9]{8,}"
+ ```
+
+ ## quick status check
+
+ from the `tap/` directory:
+ ```bash
+ just check
+ ```
+
+ shows TAP machine state, most recent indexed date, and 7-day timeline. useful for verifying indexing is working after restarts.
+
+ example output:
+ ```
+ === TAP Status ===
+ app 781417db604d48 23 ewr started ...
+
+ === Recent Indexing Activity ===
+ Last indexed: 2026-01-08 (14 docs)
+ Today: 2026-01-11
+ Docs: 3742 | Pubs: 1231
+
+ === Timeline (last 7 days) ===
+ 2026-01-08: 14 docs
+ 2026-01-07: 29 docs
+ ...
+ ```
+
+ if "Last indexed" is more than a day behind "Today", TAP may be down or catching up.
+
+ ## checking catch-up progress
+
+ when TAP restarts after downtime, it replays the firehose from its saved cursor. to check progress:
+
+ ```bash
+ # see current firehose position (look for timestamps in log messages)
+ fly logs -a leaflet-search-tap | grep -E '"time".*"seq"' | tail -3
+ ```
+
+ the `"time"` field in log messages shows how far behind TAP is. compare to current time to estimate catch-up.
+
+ catch-up speed varies:
+ - **~0.3x** when resync queue is full (large repos being fetched)
+ - **~1x or faster** once resyncs clear
+
  ## debugging

  ### check tap connection
···

  | symptom | cause | fix |
  |---------|-------|-----|
+ | TAP machine stopped, `oom_killed=true` | large repo CARs exhausted memory | increase memory to 2GB, reduce `TAP_RESYNC_PARALLELISM` to 1 |
  | `websocket handshake failed: error.Timeout` | TAP not running or network issue | restart TAP, check regions match |
  | `dialing failed: lookup ... i/o timeout` | DNS issues reaching bsky relay | restart TAP, transient network issue |
  | messages received but not indexed | extraction failing (type mismatch) | enable zat debug logging, check field types |
  | repo shows `records: 0` after adding | resync failed or collection not in filters | check TAP logs for resync errors, verify `TAP_COLLECTION_FILTERS` |
  | new platform records not appearing | platform's collection not in `TAP_COLLECTION_FILTERS` | add collection to filters, restart TAP |
+ | indexing stopped, TAP shows "started" | TAP catching up from downtime | check firehose position in logs, wait for catch-up |

  ## TAP API endpoints

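
note on the OOM log check: the `grep -E "size\":[0-9]{8,}"` filter in the diff matches any size with 8 or more digits, i.e. roughly 10MB and up. a minimal sketch of a helper that prints the sizes in MB with an adjustable threshold is below. `large_cars` is a hypothetical name, not part of the repo, and the sketch assumes each "parsing repo CAR" log line carries a `"size":<bytes>` field, which is what the existing grep pattern implies. it uses decimal megabytes (1 MB = 1,000,000 bytes).

```bash
#!/usr/bin/env bash
# large_cars (hypothetical helper): read TAP log lines on stdin and print the
# size in MB of every "parsing repo CAR" entry at or above a threshold.
# assumption: the log line contains a field shaped like "size":123456789 (bytes).
large_cars() {
  local min_mb="${1:-10}"   # threshold in MB, defaults to 10
  grep "parsing repo CAR" \
    | grep -oE '"size":[0-9]+' \
    | awk -F: -v min="$min_mb" '{
        mb = $2 / 1000000               # bytes -> decimal MB
        if (mb >= min) printf "%.1f MB\n", mb
      }'
}
```

possible usage, with the app name taken from the docs: `fly logs -a leaflet-search-tap | large_cars 100` would list only resyncs of repos at or above 100MB, which are the ones most likely to push a 1GB machine over its limit.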