+# Example seed handles for Twister graph backfill
+# One DID or handle per line. Comments and blank lines are ignored.
+
+anirudh.fi
+atprotocol.dev
+zzstoatzz.io
+oppi.li
+desertthunder.dev
+tangled.org
+1-1
docs/api/specs/07-graph-backfill.md

 | Flag | Default | Description |
 | --------------- | -------- | ----------------------------------------------- |
-| `--seeds` | required | Path to seed file |
+| `--seeds` | required | Seed source: file path or comma-separated list |
 | `--max-hops` | `2` | Max fan-out depth from seed users |
 | `--dry-run` | `false` | List discovered users without submitting to Tap |
 | `--concurrency` | `5` | Parallel discovery workers |
+11-9
docs/api/tasks/phase-1-mvp.md

 ### Tasks

-- [ ] Implement `backfill` subcommand with flags:
+- [x] Implement `backfill` subcommand with flags:
   - `--seeds <file>` — required seed file path
   - `--max-hops <n>` — depth limit for fan-out (default: 2)
   - `--dry-run` — print the discovery plan without mutating Tap
   - `--concurrency <n>` — parallel discovery workers (default: 5)
   - `--batch-size <n>` — DIDs per `/repos/add` request
   - `--batch-delay <duration>` — delay between Tap registration batches
-- [ ] Implement seed file parsing:
+- [x] Implement seed file parsing:
   - One DID or handle per line
   - `#` comments allowed
   - Blank lines ignored
   - Handles resolved to DIDs before graph expansion
-- [ ] Decide and document the initial seed file location for operators:
+- [x] Decide and document the initial seed file location for operators:
   - Repository-managed example file for format/reference
   - Deployment-specific runtime file or mounted secret for real runs
-- [ ] Implement graph discovery:
+  - Implemented: `docs/api/seeds.txt` and `packages/api/internal/backfill/doc.go`
+- [x] Implement graph discovery:
   1. Start from hop-0 seed users
   2. Fetch `sh.tangled.graph.follow` records and collect subject DIDs
   3. Fetch repo collaborators by inspecting repos, issues, PRs, and comments
   4. Enqueue newly discovered DIDs with hop metadata
   5. Stop expanding beyond `max-hops`
-- [ ] Track discovery metadata for logs:
+- [x] Track discovery metadata for logs:
   - source DID
   - hop depth
   - discovery reason (`seed`, `follow`, `collaborator`)
-- [ ] Integrate with Tap admin endpoints:
+- [x] Integrate with Tap admin endpoints:
   - `GET /info/:did` to skip already-tracked repos when practical
   - `POST /repos/add` to register new DIDs for backfill
-- [ ] Make the command safe to re-run:
+- [x] Make the command safe to re-run:
   - in-memory visited DID set during crawl
   - tolerate duplicate `/repos/add`
   - rely on index upsert idempotency for re-delivered records
-- [ ] Add operator-friendly logging:
+- [x] Add operator-friendly logging:
   - seed count
   - users discovered per hop
   - already-tracked vs newly-submitted DIDs
   - batch progress
   - final totals
-- [ ] Add a short runbook covering:
+- [x] Add a short runbook covering:
   - first bootstrap against an empty database
   - repeat run after expanding the seed list
   - dry-run before production mutation
+  - Implemented: `packages/api/internal/backfill/doc.go`
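
The graph discovery steps in the task list above (start from hop-0 seeds, enqueue newly discovered DIDs with hop metadata, stop at `max-hops`, keep an in-memory visited set) can be sketched as a breadth-first crawl. This is only an illustration under stated assumptions: the names `discovered`, `edge`, `neighborFunc`, and `discover` are hypothetical, the neighbor lookup stands in for the real follow and collaborator fetches, and the sketch is sequential rather than using the parallel workers that `--concurrency` implies.

```go
package main

// discovered records how a DID entered the crawl, mirroring the
// metadata the task list asks to track (source DID, hop depth, reason).
type discovered struct {
	DID    string
	Source string // DID that led here; empty for seeds
	Hop    int
	Reason string // "seed", "follow", or "collaborator"
}

// edge is one outgoing relationship from a DID.
type edge struct {
	DID    string
	Reason string
}

// neighborFunc is a hypothetical stand-in for fetching follow records
// and repo collaborators for a DID. It is not a real Twister or Tap API.
type neighborFunc func(did string) ([]edge, error)

// discover walks the graph breadth-first from the seeds and returns
// every DID reached within maxHops, in discovery order.
func discover(seeds []string, maxHops int, neighbors neighborFunc) ([]discovered, error) {
	visited := make(map[string]struct{}, len(seeds))
	var queue, out []discovered

	for _, s := range seeds {
		if _, ok := visited[s]; ok {
			continue // duplicate seed
		}
		visited[s] = struct{}{}
		queue = append(queue, discovered{DID: s, Hop: 0, Reason: "seed"})
	}

	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		out = append(out, cur)

		// DIDs at the hop limit are recorded but not expanded further.
		if cur.Hop >= maxHops {
			continue
		}
		next, err := neighbors(cur.DID)
		if err != nil {
			return out, err
		}
		for _, e := range next {
			if _, ok := visited[e.DID]; ok {
				continue // already queued or processed
			}
			visited[e.DID] = struct{}{}
			queue = append(queue, discovered{
				DID: e.DID, Source: cur.DID, Hop: cur.Hop + 1, Reason: e.Reason,
			})
		}
	}
	return out, nil
}
```

Marking a DID as visited when it is enqueued, rather than when it is dequeued, keeps the same DID from entering the queue twice when several users point at it. Because the crawl is breadth-first, each DID is recorded with the smallest hop depth at which it is reachable.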

 ### Verification
