atproto user agency toolkit for individuals and groups

Update README with current architecture, lexicon flow, and replication modes

Rewrites README to reflect the decoupled architecture (no node identity,
lazy OAuth identity, any-PDS compatibility), documents the three lexicon
record types and where they're used in the offer→replication flow, and
describes the three replication modes (reciprocal, consensual,
non-consensual archive). Updates libp2p description in CLAUDE.md.
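The three record types named in this commit can be sketched as TypeScript shapes. This is illustrative only: the NSIDs, rkey conventions, and the colons→hyphens rkey mapping come from the diff below, but the field names are assumptions — the authoritative schemas live in `lexicons/`.

```typescript
// Illustrative shapes for the three lexicon record types.
// Field names are assumptions; lexicons/ is authoritative.
interface PeerRecord {            // org.p2pds.peer, rkey "self"
  did: string;                    // atproto DID this peer acts for
  peerId: string;                 // libp2p PeerID
  multiaddrs: string[];           // transport addresses for direct dialing
  endpoint: string;               // p2pds HTTP endpoint URL
}

interface ReplicationOffer {      // org.p2pds.replication.offer
  subject: string;                // DID the offerer is willing to replicate
  createdAt: string;
}

interface ReplicationConsent {    // org.p2pds.replication.consent, rkey "self"
  createdAt: string;              // presence of the record = opt-in to archival
}

// Offer records are keyed by the subject DID with colons mapped to hyphens.
function offerRkey(subjectDid: string): string {
  return subjectDid.replace(/:/g, "-");
}
```

Keying offers by the subject DID makes a node's willingness to replicate a given account directly addressable in its repo.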

+94 -188

CLAUDE.md  (+1 -1)
````diff
···
 - **Base:** Generalized from [Cirrus](https://github.com/ascorbic/cirrus)
 - **HTTP:** Hono
 - **Database:** better-sqlite3 (sync API)
-- **IPFS:** Helia (libp2p + DHT + bitswap + gossipsub)
+- **IPFS:** Helia (minimal libp2p: TCP + noise + yamux + autoNAT)
 - **Identity:** AT Protocol DIDs via PLC directory
 - **Content addressing:** DASL CIDs (CIDv1, SHA-256, dag-cbor/raw, base32lower). All CIDs must be DASL-compliant — enforced by `@atcute/cid`.
 - **Desktop:** Tauri v2 (optional, `apps/desktop/`)
````
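The minimal libp2p stack named in the changed line (TCP + noise + yamux + autoNAT) might be wired up roughly as follows. This is a sketch, not the project's actual code: package names are the usual js-libp2p modules, and option keys such as `connectionEncrypters` vary across libp2p major versions.

```typescript
// Sketch of a minimal Helia node matching the stack described above.
// Assumptions: current js-libp2p option names; the real wiring lives in
// the project's IpfsService.
import { createHelia } from "helia";
import { createLibp2p } from "libp2p";
import { tcp } from "@libp2p/tcp";
import { noise } from "@chainsafe/libp2p-noise";
import { yamux } from "@chainsafe/libp2p-yamux";
import { identify } from "@libp2p/identify";
import { autoNAT } from "@libp2p/autonat";

async function createNode() {
  const libp2p = await createLibp2p({
    transports: [tcp()],
    connectionEncrypters: [noise()],
    streamMuxers: [yamux()],
    // No DHT, gossipsub, relay, UPnP, or WebRTC: known peers are dialed
    // directly via multiaddrs published in org.p2pds.peer records.
    services: { identify: identify(), autoNAT: autoNAT() },
  });
  return createHelia({ libp2p });
}
```

Keeping the service list this small avoids the CPU cost of connecting to random peers, which is the rationale the README gives for dropping the discovery services.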
README.md  (+93 -187)
````diff
···
 
 Peer-to-peer replication infrastructure for AT Protocol. Backs up and serves atproto account data over IPFS, acting on behalf of authenticated users.
 
-- Syncs and stores repos and blobs for configured accounts
-- Provides data on P2P networks (IPFS/libp2p) for other nodes to replicate
-- Fetches and stores data from P2P networks for serviced accounts
-- Mutual replication agreements between peers via on-protocol consent records
-- Push-based offer discovery: nodes notify each other of replication offers
+P2PDS is infrastructure — like a torrent client for atproto data. It does not have its own identity. Users authenticate with their own atproto accounts, and coordination records are published to the user's own repo via their PDS.
 
-P2PDS is infrastructure — like a torrent client for atproto data. It does not have its own identity. Users authenticate with their own atproto accounts, and records (`org.p2pds.peer`, `org.p2pds.replication.offer`) are published to the user's own repo via their PDS.
+## How it works
 
-## Stack
+A user logs in with their atproto account. P2PDS syncs their repo and blobs into a local SQLite-backed IPFS blockstore. To replicate another user's data, the user publishes an offer record to their own repo. If the other user's node reciprocates, both nodes detect the mutual agreement and begin syncing automatically. All coordination happens through atproto records — no custom signaling protocol.
 
-- **Runtime**: Node.js, TypeScript (ES2022, strict)
-- **Base**: Generalized from [Cirrus](https://github.com/ascorbic/cirrus)
-- **HTTP**: Hono
-- **Database**: better-sqlite3 (sync API) — all state in a single `pds.db` file
-- **IPFS**: Helia with minimal libp2p (TCP + noise + yamux), SQLite-backed blockstore
-- **Identity**: AT Protocol DIDs via PLC directory
-- **Auth**: OAuth (primary) or legacy JWT (fallback)
-- **Desktop**: Tauri v2 (optional, `apps/desktop/`)
-- **Content addressing**: [DASL](https://dasl.ing/) CIDs (CIDv1, SHA-256, dag-cbor/raw, base32lower)
+### Architecture
 
-## Architecture
+P2PDS is fully decoupled from any specific PDS. It connects to the user's PDS via OAuth and uses standard atproto APIs for everything:
 
 ```
-User's atproto account (any PDS: Bluesky, Cirrus, self-hosted)
+User's atproto account (any PDS)
 
-     ▼
+     ▼ OAuth + com.atproto.sync.*
 ┌─────────┐
-│ p2pds   │ ← replication infrastructure (local, cloud, or co-located)
+│ p2pds   │ replication infrastructure
 │         │
-│ SQLite  │ blocks, blobs, sync state, peer routing, challenge history
-│ Helia   │ IPFS storage, direct peer connections, bitswap
-│ Hono    │ XRPC endpoints, app, RASL
+│ SQLite  │ blocks, blobs, sync state, peer routing
+│ Helia   │ IPFS blockstore, peer connections
+│ Hono    │ XRPC endpoints, web UI
 └─────────┘
 
-     ▼
-Other p2pds nodes (mutual replication via offer records)
+     ▼ libp2p / HTTP
+Other p2pds nodes
 ```
 
-### Storage
-
-All persistent state lives in a single SQLite database (`pds.db`):
-
-- **IPFS blocks** (`ipfs_blocks`) — replaces filesystem blockstore, avoids thousands of tiny files
-- **IPFS datastore** (`ipfs_datastore`) — replaces filesystem datastore for libp2p peer/routing data
-- **Replication state** — sync progress, peer info, block/blob tracking, firehose cursor
-- **Challenge history** — proof-of-storage results and peer reliability scores
-- **Incoming offers** — offers from other nodes awaiting accept/reject
-- **Node identity** — DID + handle, established on first OAuth login
-
-### Identity model
-
-P2PDS starts without an identity. On first OAuth login, the user's DID becomes the node identity and is persisted in SQLite. Subsequent restarts load the identity from the database. This "lazy identity" model means:
-
-- No DID or signing key required in config
-- Identity established interactively via the app
-- `RepoManager` is optional throughout (firehose, replication, startup all handle its absence)
-
-### Libp2p configuration
-
-Helia runs with a minimal libp2p stack: TCP transport, Noise encryption, Yamux multiplexing, and Identify only. No DHT, gossipsub, relay, autoNAT, UPnP, or WebRTC — those services peg CPU connecting to random peers. P2PDS dials known peers directly using multiaddrs from `org.p2pds.peer` records.
-
-### Replication flow
-
-1. User adds a DID via the app → publishes an `org.p2pds.replication.offer` record
-2. Node resolves the target's `org.p2pds.peer` record to find their p2pds endpoint
-3. Node POSTs a notification to the target's `notifyOffer` endpoint
-4. Target verifies the offer exists in the offerer's repo (anti-spoofing)
-5. Target's app shows the incoming offer with Accept/Reject buttons
-6. Accepting creates a reciprocal offer + push notification back
-7. Both nodes detect mutual agreement → promote to active replication
-8. Sync loop: fetch repo, store blocks/blobs, verify, announce
-
-### Design choices
-
-- **DHT only** for discovery/routing — no IPNI or centralized indexers
-- **Slow data is fine** as a tradeoff for resilience and decentralization
-- **Transport-agnostic verification** — RASL works over any HTTP transport
-- **DASL-compliant content addressing** — all CIDs are CIDv1 + SHA-256 with either dag-cbor (`0x71`) or raw (`0x55`) codec, encoded as base32lower (`b` prefix). Enforced by `@atcute/cid`.
 - **No node identity** — p2pds acts on behalf of users, not as its own entity
-- **SQLite everywhere** — single-file database, no filesystem blockstore churn
-- **Consent-gated replication** — "Add" publishes an offer, not an immediate sync. Replication only begins when both sides agree.
+- **Lazy identity** — starts without a DID; identity established on first OAuth login
+- **Any PDS** — works with Bluesky, self-hosted, or any atproto-compatible PDS
+- **Any deployment** — local desktop (Tauri), cloud, co-located server
 
-### Deployment flexibility
+### Lexicons
 
-P2PDS works with any combination of:
-- **PDS**: Bluesky, Cirrus, self-hosted, any atproto-compatible PDS
-- **Location**: local desktop (via Tauri app), cloud (Railway, etc.), co-located server
+P2PDS defines three record types published to the user's own repo:
 
-## Lexicons
-
-P2PDS defines two record types for the on-protocol interop surface:
-
-| NSID | Repo key | Purpose |
-|------|----------|---------|
-| `org.p2pds.peer` | `self` | Binds an atproto DID to a libp2p PeerID + multiaddrs + p2pds endpoint URL |
-| `org.p2pds.replication.offer` | `any` | Declares willingness to replicate a specific DID's data |
+| NSID | rkey | Purpose | Used in |
+|------|------|---------|---------|
+| `org.p2pds.peer` | `self` | Binds DID → libp2p PeerID + multiaddrs + p2pds endpoint URL | Peer discovery: nodes read this to find each other's transport addresses and HTTP endpoints |
+| `org.p2pds.replication.offer` | DID (colons→hyphens) | Declares willingness to replicate a specific DID | Offer negotiation: mutual offers trigger automatic replication agreements |
+| `org.p2pds.replication.consent` | `self` | Opt-in: "I consent to being archived" | Consensual archive: peers check this before archiving without reciprocal offer |
 
 Schemas are in `lexicons/` and validated by `src/lexicons.ts`.
 
-**Offer negotiation**: Peers publish offer records declaring willingness to replicate specific DIDs. When two peers have mutual offers (A offers to replicate B, B offers to replicate A), a replication agreement is automatically formed. Parameters are merged: `max(minCopies)`, `min(intervalSec)`, `max(priority)`. Revoking an offer = deleting the record.
-
-**Push notifications**: When a node publishes an offer, it resolves the target's `org.p2pds.peer` record to find their p2pds HTTP endpoint and POSTs a notification. The receiving node verifies the offer exists in the sender's repo before storing it (prevents spoofing).
+### Replication flow
 
-## Verification
+1. User adds a DID via the web UI
+2. Node publishes `org.p2pds.replication.offer` to the user's repo
+3. Node resolves the target's `org.p2pds.peer` record → finds their p2pds endpoint
+4. Node POSTs a push notification to the target's `notifyOffer` endpoint
+5. Target verifies the offer exists in the offerer's repo (anti-spoofing)
+6. Target user sees the incoming offer in their UI → Accept or Reject
+7. Accepting creates a reciprocal offer + push notification back
+8. Both nodes detect mutual agreement → auto-generate replication policy
+9. Sync loop: fetch repo via libp2p (peer-first) or HTTP (PDS fallback), store blocks/blobs, verify, announce
+10. Real-time updates via firehose subscription between periodic syncs
 
-Content-addressed retrieval is unforgeable: if a peer returns the correct bytes for a CID, they have the data. The verification stack exploits this property at multiple layers:
+Three replication modes:
+- **Reciprocal archive** — Mutual consent, bidirectional replication
+- **Consensual archive** — One-way replication with explicit opt-in from the target
+- **Non-consensual archive** — One-way replication without target's explicit permission
 
-| Layer | Name | Method | Status |
-|-------|------|--------|--------|
-| L0 | Commit root | Fetch repo root CID via RASL from remote PDS | Done |
-| L1 | RASL sampling | Fetch random block sample via HTTP, compare with local copy | Done |
-| L2 | Block-sample challenge | Challenge peers to produce specific blocks, verify via libp2p or HTTP | Done |
-| L3 | MST proof challenge | Challenge peers to produce Merkle path proofs for specific records | Done |
+### Verification
 
-**Challenge-response protocol**: Three message types (`StorageChallenge` → `StorageChallengeResponse` → `StorageChallengeResult`). Deterministic generation from epoch + DIDs + nonce. Transport-agnostic with libp2p primary and HTTP fallback (`FailoverChallengeTransport`). Challenge history and peer reliability tracked in SQLite.
-
-## Replication
-
-Sync loop (per DID, periodic with policy-driven intervals):
-
-1. Resolve DID → PDS endpoint (via PLC directory)
-2. Try libp2p peer-first sync if peer info is known
-3. Fall back to HTTP PDS fetch (`com.atproto.sync.getRepo`, incremental via `since`)
-4. Parse CAR, store blocks in IPFS, fetch and store blobs
-5. Track block/blob CIDs, populate record paths via MST walk
-6. Verify local block availability
-7. If source PDS fails, fall back to peer endpoints
+Content-addressed retrieval is unforgeable: correct bytes for a CID = proof of storage. The verification stack:
 
-**Real-time sync**: Firehose subscription (`com.atproto.sync.subscribeRepos`) with cursor persistence, DID filtering, and incremental block application. Gossipsub commit notifications for low-latency cross-node coordination.
+| Layer | Method |
+|-------|--------|
+| L0 | Commit root — fetch repo root CID via RASL from source PDS |
+| L1 | RASL sampling — fetch random blocks via HTTP, compare with local |
+| L2 | Block-sample challenge — challenge peers to produce specific blocks |
+| L3 | MST proof challenge — challenge peers to produce Merkle path proofs |
 
-**GC and tombstones**: Deferred GC via `needs_gc` flag on delete/update ops. Full block/blob reconciliation during sync via MST walk. Cross-DID block sharing safety. Tombstone detection via firehose `#account` events with 24hr grace period.
+Challenge-response protocol: `StorageChallenge` → `StorageChallengeResponse` → `StorageChallengeResult`. Deterministic generation from epoch + DIDs + nonce. Transport-agnostic with libp2p primary and HTTP fallback.
 
-## Policy Engine
+### Storage
 
-Declarative, deterministic, transport-agnostic policy system operating on atproto accounts:
+All persistent state in a single SQLite database (`pds.db`):
 
-- **Mutual aid**: N-of-M redundancy between cooperating peers
-- **SaaS**: SLA compliance with minimum copy counts and sync intervals
-- **Group governance**: Multi-party replication agreements
+- **IPFS blocks/datastore** — SQLite-backed, no filesystem churn
+- **Replication state** — sync progress, peer info, block/blob tracking, firehose cursor
+- **Challenge history** — proof-of-storage results and peer reliability scores
+- **Node identity** — DID + handle, established on first OAuth login
 
-Policies drive sync intervals, priority ordering, and `shouldReplicate` filtering in the replication manager. P2P policies are auto-generated from mutual offer records with `p2p:` prefixed IDs.
+### Policy engine
 
-## App
+Declarative, deterministic policy system operating on atproto accounts:
 
-- **App**: Server-rendered HTML at `/` with auto-refresh, account search, incoming offer notifications
-- **API**: Authenticated XRPC endpoints for overview, per-DID status, network status, policies, sync history
-- **DID management**: Add/remove/offer DIDs at runtime via app or API
-- **Incoming offers**: Accept/reject replication offers from other nodes via app
-- **Rate limiting**: Per-IP limits across all endpoint groups (meta, sync, session, read, write, challenge, app, notifyOffer)
+- **Mutual aid** — N-of-M redundancy between cooperating peers
+- **SaaS** — SLA compliance with minimum copy counts and sync intervals
+- **Group governance** — Multi-party replication agreements
 
-## Desktop App
+Policies drive sync intervals, priority ordering, and filtering. P2P policies are auto-generated from mutual offer records.
 
-Optional Tauri v2 wrapper at `apps/desktop/`. Spawns p2pds as a sidecar process and loads the app in a webview.
+## Stack
 
-```
-cd apps/desktop
-npm run build:sidecar  # compile p2pds to standalone binary via pkg
-cargo tauri dev        # run in development
-cargo tauri build      # build distributable
-```
+- **Runtime**: Node.js, TypeScript (ES2022, strict)
+- **HTTP**: Hono
+- **Database**: better-sqlite3
+- **IPFS**: Helia with minimal libp2p (TCP + noise + yamux + autoNAT)
+- **UI**: Lit web components, esbuild-bundled
+- **Identity**: AT Protocol DIDs via PLC directory
+- **Auth**: OAuth (primary) or legacy JWT (fallback)
+- **Content addressing**: [DASL](https://dasl.ing/) CIDs (CIDv1, SHA-256, dag-cbor/raw, base32lower)
+- **Desktop**: Tauri v2 (optional, `apps/desktop/`)
 
 ## Development
···
 npm run dev
 ```
 
-### Two-node manual testing
-
-Scripts for running two p2pds nodes locally for manual testing:
+### Two-node testing
 
 ```bash
-npm run start:both    # Build and start both nodes on random ports
-npm run start:node1   # Start node 1 only
-npm run start:node2   # Start node 2 only (uses data-node2/.env)
+npm run start:both            # Build and start both nodes
+npm run start:both -- --clean # Wipe data first
 npm run stop                  # Stop both nodes
-npm run clean         # Wipe data for both nodes (keeps .env files)
-npm run logs          # Show recent logs for both nodes
-npm run test:add-did  # Offer a DID on a running node
+npm run logs                  # Tail logs for both nodes
+npm run health                # Check node1 health
+npm run check-api             # Full API check with auth
 ```
 
-Node 2 requires a `data-node2/.env` file with a separate `OAUTH_ENABLED=true` and `DATA_DIR=./data-node2` config. Both nodes pick random ports and write them to `/tmp/p2pds-node{1,2}.port`.
-
 ### Project structure
 
 ```
···
 start.ts               Server startup orchestrator
 config.ts              Config interface + loadConfig()
 ipfs.ts                IpfsService (Helia wrapper, SQLite-backed)
-sqlite-blockstore.ts   SQLite blockstore for Helia (replaces FsBlockstore)
-sqlite-datastore.ts    SQLite datastore for libp2p (replaces FsDatastore)
-repo-manager.ts        Local repo management
-storage.ts             SQLite block storage
-blobs.ts               Blob storage
-middleware/auth.ts     Auth middleware (OAuth + legacy JWT)
-oauth/                 OAuth client, routes, session/state stores, PdsClient
-replication/           Sync, verification, challenges, offers, gossipsub
+build-ui.ts            esbuild bundler for Lit UI
+ui/                    Lit web components (app shell, cards, state)
+replication/           Sync, verification, challenges, offers
 policy/                Policy engine types, engine, presets
+oauth/                 OAuth client, routes, PdsClient
 xrpc/                  XRPC endpoint handlers
+middleware/            Auth, rate limiting, body limits
 scripts/               Two-node testing scripts
 lexicons/              Lexicon JSON schemas
 apps/desktop/          Tauri desktop app
···
 
 Environment variables (or `.env` file):
 
-| Variable | Required | Default | Description |
-|----------|----------|---------|-------------|
-| `OAUTH_ENABLED` | No | `false` | Enable OAuth login (recommended) |
-| `PUBLIC_URL` | No | `http://localhost:$PORT` | Public URL for push notifications between nodes |
-| `DATA_DIR` | No | `./data` | Data directory |
-| `PORT` | No | `3000` | HTTP port |
-| `IPFS_ENABLED` | No | `true` | Enable IPFS |
-| `IPFS_NETWORKING` | No | `true` | Enable libp2p networking |
-| `REPLICATE_DIDS` | No | | Comma-separated DIDs to replicate |
-| `FIREHOSE_URL` | No | `wss://bsky.network/...` | Firehose WebSocket URL |
-| `FIREHOSE_ENABLED` | No | `true` | Enable firehose sync |
-| `POLICY_FILE` | No | | Path to policy JSON file |
-| `RATE_LIMIT_ENABLED` | No | `true` | Enable rate limiting |
-
-**Legacy auth** (when `OAUTH_ENABLED=false`):
-
-| Variable | Required | Description |
-|----------|----------|-------------|
-| `DID` | Yes | Your atproto DID |
-| `HANDLE` | Yes | Your handle |
-| `PDS_HOSTNAME` | Yes | PDS hostname |
-| `AUTH_TOKEN` | Yes | Static auth token |
-| `JWT_SECRET` | Yes | JWT signing secret |
-| `PASSWORD_HASH` | Yes | Bcrypt password hash |
-| `SIGNING_KEY` | No | Hex-encoded secp256k1 private key |
-
-## Status
-
-1. Single-user PDS — done
-2. Record replication with IPFS storage — done
-3. Real-time firehose sync — done
-4. Layered verification (L0-L3) — done
-5. Challenge-response proof-of-storage — done
-6. Policy engine — done
-7. P2P offer negotiation — done
-8. Consent-gated replication — done
-9. Incoming offer discovery via push notification — done
-10. App UI + DID management — done
-11. Rate limiting — done
-12. Architecture refactor (user-DID model, lazy identity) — done
-13. SQLite-backed IPFS storage — done
-14. Lexicon definitions — done
-15. Desktop app skeleton — done
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `PORT` | `3000` | HTTP port |
+| `DATA_DIR` | `./data` | Data directory |
+| `OAUTH_ENABLED` | `true` | Enable OAuth login |
+| `PUBLIC_URL` | `http://localhost:$PORT` | Public URL for push notifications |
+| `IPFS_ENABLED` | `true` | Enable IPFS |
+| `IPFS_NETWORKING` | `true` | Enable libp2p networking |
+| `REPLICATE_DIDS` | | Comma-separated DIDs to replicate on startup |
+| `FIREHOSE_URL` | `wss://bsky.network/...` | Firehose WebSocket URL |
+| `FIREHOSE_ENABLED` | `true` | Enable firehose sync |
+| `POLICY_FILE` | | Path to policy JSON file |
+| `RATE_LIMIT_ENABLED` | `true` | Enable rate limiting |
````
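The mutual-agreement step of the replication flow — both nodes detecting reciprocal offers and merging their parameters — can be sketched as pure functions. The merge rules (`max(minCopies)`, `min(intervalSec)`, `max(priority)`) come from the earlier README's offer-negotiation text, which this commit removes; whether they still hold verbatim is not stated in the diff, and the offer field names here are illustrative assumptions.

```typescript
// Illustrative offer shape; field names are assumptions, lexicons/ is authoritative.
interface Offer {
  owner: string;        // DID of the repo the offer record lives in
  subject: string;      // DID the owner is willing to replicate
  minCopies: number;
  intervalSec: number;
  priority: number;
}

// A and B have a mutual agreement when A offers to replicate B and B offers
// to replicate A. Both nodes can detect this independently from public records.
function isMutual(mine: Offer, theirs: Offer): boolean {
  return mine.subject === theirs.owner && theirs.subject === mine.owner;
}

// Merge the two sides' parameters: keep the stronger redundancy target, the
// tighter sync interval, and the higher priority.
function mergeParams(a: Offer, b: Offer) {
  return {
    minCopies: Math.max(a.minCopies, b.minCopies),
    intervalSec: Math.min(a.intervalSec, b.intervalSec),
    priority: Math.max(a.priority, b.priority),
  };
}
```

Because both functions are deterministic over records each side can read from the other's repo, the two nodes converge on the same agreement without any extra negotiation round-trips.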