atproto user agency toolkit for individuals and groups

Update README and CLAUDE.md for architecture refactor

Rewrite README to reflect the new user-DID model, DASL compliance
requirement, lexicon definitions, verification layers, policy engine,
desktop app, and current project status. Update CLAUDE.md to match.

+159 -61 total

CLAUDE.md (+15 -20)
````diff
···
 ## Project Overview
 
-P2PDS is an AT Protocol (atproto) Personal Data Server with P2P capabilities. It syncs and stores records for a set of accounts, provides them on P2P networks, and fetches/stores records from P2P networks for serviced accounts.
+P2PDS is peer-to-peer replication infrastructure for AT Protocol. It syncs and stores account data (repos, blobs) for configured DIDs, provides them over IPFS/libp2p, and fetches data from other p2pds nodes. It acts on behalf of authenticated atproto users — it has no identity of its own.
 
-## Planned Tech Stack
+## Tech Stack
 
-- **Runtime:** Node.js
-- **Base:** Generalized version of [Cirrus](https://github.com/ascorbic/cirrus) (Cloudflare-specific parts abstracted away)
-- **IPFS:** Helia with DHT and pubsub enabled
+- **Runtime:** Node.js, TypeScript (ES2022, strict, NodeNext modules)
+- **Base:** Generalized from [Cirrus](https://github.com/ascorbic/cirrus)
+- **HTTP:** Hono
+- **Database:** better-sqlite3 (sync API)
+- **IPFS:** Helia (libp2p + DHT + bitswap + gossipsub)
 - **Identity:** AT Protocol DIDs via PLC directory
-- **Addressing:** DASL addresses for IPFS-stored records
+- **Content addressing:** DASL CIDs (CIDv1, SHA-256, dag-cbor/raw, base32lower). All CIDs must be DASL-compliant — enforced by `@atcute/cid`.
+- **Desktop:** Tauri v2 (optional, `apps/desktop/`)
 
-## Architecture (Planned)
+## Architecture
 
-The system is configured with a list of DIDs and operates as follows:
-1. On first run, queries PLC directory to resolve PDSes for all configured DIDs
-2. Fetches records from each DID's PDS
-3. Stores and provides records over IPFS using DASL addresses
-4. Syncs records bidirectionally between PDS and IPFS
-
-Open design problem: DID-to-PeerID mapping (both can update/rotate).
-
-## Development Phases
-
-1. Single-user PDS working as local node service
-2. Record replication with local storage
-3. IPFS integration for replicated records
+- Users authenticate with their own atproto accounts (any PDS)
+- Records (`org.p2pds.peer`, `org.p2pds.replication.offer`) publish to the user's own repo
+- No node identity — p2pds is infrastructure, like a torrent client
+- Mutual offers between peers auto-generate replication policies
+- Challenge-response protocol verifies peers actually store the data they claim to
 
 ## Tool Usage Rules
···
````
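The DASL profile named in the diff (CIDv1, SHA-256, raw or dag-cbor, base32lower) is mechanical enough to check structurally. The sketch below validates it by hand; the project itself delegates this to `@atcute/cid`, so the function names and the hand-rolled base32 decoding here are purely illustrative:

```typescript
// RFC 4648 base32 alphabet, lowercase, as used by the "b" multibase prefix.
const BASE32 = "abcdefghijklmnopqrstuvwxyz234567";

// Minimal base32lower decoder (no padding). A strict decoder would also
// reject non-zero leftover bits; this sketch just drops them.
function base32Decode(s: string): Uint8Array | null {
  let bits = 0;
  let value = 0;
  const out: number[] = [];
  for (const ch of s) {
    const idx = BASE32.indexOf(ch);
    if (idx === -1) return null; // uppercase or invalid character
    value = (value << 5) | idx;
    bits += 5;
    if (bits >= 8) {
      bits -= 8;
      out.push((value >>> bits) & 0xff);
    }
  }
  return Uint8Array.from(out);
}

// Structural DASL check: base32lower ("b" prefix), CIDv1, raw (0x55) or
// dag-cbor (0x71) codec, sha2-256 (0x12) with a 32-byte digest.
function isDaslCid(cid: string): boolean {
  if (!cid.startsWith("b")) return false; // base32lower multibase prefix
  const bytes = base32Decode(cid.slice(1));
  if (bytes === null || bytes.length !== 36) return false; // 4 header bytes + 32-byte digest
  return (
    bytes[0] === 0x01 && // CIDv1
    (bytes[1] === 0x55 || bytes[1] === 0x71) && // raw or dag-cbor
    bytes[2] === 0x12 && // sha2-256
    bytes[3] === 0x20 // 32-byte digest length
  );
}
```

Because all four header fields are single-byte varints under this profile, every DASL CID decodes to exactly 36 bytes, which is why the length check alone rejects most malformed inputs.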

README.md (+144 -41)
````diff
 # P2PDS
 
-An AT Protocol Personal Data Server with P2P replication.
+Peer-to-peer replication infrastructure for AT Protocol. Backs up and serves atproto account data over IPFS, acting on behalf of authenticated users.
 
-- Syncs and stores records for a set of accounts
-- Provides records on P2P networks for other nodes to sync and store
-- Fetches and stores records from P2P networks for serviced accounts
+- Syncs and stores repos and blobs for configured accounts
+- Provides data on P2P networks (IPFS/libp2p) for other nodes to replicate
+- Fetches and stores data from P2P networks for serviced accounts
+- Mutual replication agreements between peers via on-protocol consent records
+
+P2PDS is infrastructure — like a torrent client for atproto data. It does not have its own identity. Users authenticate with their own atproto accounts, and records (`org.p2pds.peer`, `org.p2pds.replication.offer`) are published to the user's own repo via their PDS.
 
 ## Stack
 
-- **Runtime**: Node.js, TypeScript
+- **Runtime**: Node.js, TypeScript (ES2022, strict)
 - **Base**: Generalized from [Cirrus](https://github.com/ascorbic/cirrus)
 - **HTTP**: Hono
-- **Database**: better-sqlite3
-- **IPFS**: Helia (libp2p + DHT + FsBlockstore)
+- **Database**: better-sqlite3 (sync API)
+- **IPFS**: Helia (libp2p + DHT + bitswap + gossipsub)
 - **Identity**: AT Protocol DIDs via PLC directory
+- **Desktop**: Tauri v2 (optional, `apps/desktop/`)
+- **Content addressing**: [DASL](https://dasl.ing/) CIDs (CIDv1, SHA-256, dag-cbor/raw, base32lower)
 
 ## Architecture
 
+```
+User's atproto account (any PDS: Bluesky, Cirrus, self-hosted)
+
+
+┌─────────┐
+│  p2pds  │ ← replication infrastructure (local, cloud, or co-located)
+│         │
+│ SQLite  │ block/blob tracking, sync state, challenge history
+│ Helia   │ IPFS storage, DHT announcements, bitswap, gossipsub
+│ Hono    │ XRPC endpoints, admin dashboard, RASL
+└─────────┘
+
+
+Other p2pds nodes (mutual replication via offer records)
+```
+
 Configured with a list of DIDs to replicate:
 
 1. Resolves DIDs via PLC directory to find source PDSes
-2. Fetches repos as CAR files from each DID's PDS
+2. Fetches repos as CAR files from each DID's PDS (incremental via `since`)
 3. Stores blocks in IPFS (Helia) and announces via DHT
 4. Serves blocks via content-addressed RASL endpoint
-5. Publishes peer identity and replication manifests as atproto records (`org.p2pds.peer`, `org.p2pds.manifest`)
-6. Verifies block availability on remote peers via layered verification
+5. Real-time sync via firehose (`com.atproto.sync.subscribeRepos`)
+6. Gossipsub notifications for low-latency cross-node sync
+7. Verifies block availability on remote peers via challenge-response protocol
 
-Design choices:
+### Design choices
 
 - **DHT only** for discovery/routing — no IPNI or centralized indexers
 - **Slow data is fine** as a tradeoff for resilience and decentralization
 - **Transport-agnostic verification** — RASL works over any HTTP transport
+- **DASL-compliant content addressing** — all CIDs are CIDv1 + SHA-256 with either dag-cbor (`0x71`) or raw (`0x55`) codec, encoded as base32lower (`b` prefix). This is enforced at the library level by `@atcute/cid` and matches atproto's CID conventions. See [DASL CID spec](https://dasl.ing/cid.html).
+- **No node identity** — p2pds acts on behalf of users, not as its own entity. Records publish to the user's own atproto repo.
 
-## Verification Layers
+### Deployment flexibility
+
+P2PDS works with any combination of:
+
+- **PDS**: Bluesky, Cirrus, self-hosted, any atproto-compatible PDS
+- **Location**: local desktop (via Tauri app), cloud (Railway, etc.), co-located server
+
+## Lexicons
+
+P2PDS defines two record types for the on-protocol interop surface:
+
+| NSID | Repo key | Purpose |
+|------|----------|---------|
+| `org.p2pds.peer` | `self` | Binds an atproto DID to a libp2p PeerID + multiaddrs |
+| `org.p2pds.replication.offer` | `any` | Declares willingness to replicate a specific DID's data |
+
+Schemas are in `lexicons/` and validated by `src/lexicons.ts`.
+
+**Offer negotiation**: Peers publish offer records declaring willingness to replicate specific DIDs. When two peers have mutual offers (A offers to replicate B, B offers to replicate A), a replication agreement is automatically formed. Parameters are merged: `max(minCopies)`, `min(intervalSec)`, `max(priority)`. Revoking an offer = deleting the record.
````
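The parameter-merge rule for mutual offers can be pinned down in a few lines of TypeScript. In this sketch, `OfferParams` and `mergeOfferParams` are illustrative names, not the project's actual types:

```typescript
// Sketch of the mutual-offer merge rule: when A offers to replicate B
// and B offers to replicate A, the agreement takes the stricter value
// from each side.
interface OfferParams {
  minCopies: number;   // redundancy floor requested by the offerer
  intervalSec: number; // how often the offerer promises to sync
  priority: number;    // scheduling weight
}

function mergeOfferParams(a: OfferParams, b: OfferParams): OfferParams {
  return {
    minCopies: Math.max(a.minCopies, b.minCopies),       // max(minCopies)
    intervalSec: Math.min(a.intervalSec, b.intervalSec), // min(intervalSec)
    priority: Math.max(a.priority, b.priority),          // max(priority)
  };
}
```

Taking `min(intervalSec)` makes the agreement sync at least as often as either side asked for, while the two `max` rules preserve the stricter redundancy floor and priority.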
````diff
+
+## Verification
 
 Content-addressed retrieval is unforgeable: if a peer returns the correct bytes for a CID, they have the data. The verification stack exploits this property at multiple layers:
 
 | Layer | Name | Method | Status |
 |-------|------|--------|--------|
-| L0 | Commit root | Fetch repo root CID via RASL from remote PDS | Implemented |
-| L1 | RASL sampling | Fetch random block sample via HTTP, compare with local copy | Implemented |
-| L2 | libp2p+HTTP | Same RASL verification logic over libp2p transports (P2P HTTP) | Blocked on Helia |
-| L3 | MST path proof | Verify Merkle path proofs via `com.atproto.sync.getRecord` | Future |
+| L0 | Commit root | Fetch repo root CID via RASL from remote PDS | Done |
+| L1 | RASL sampling | Fetch random block sample via HTTP, compare with local copy | Done |
+| L2 | Block-sample challenge | Challenge peers to produce specific blocks, verify via libp2p or HTTP | Done |
+| L3 | MST proof challenge | Challenge peers to produce Merkle path proofs for specific records | Done |
 
-**L0** and **L1** run on a configurable timer (default 30 min), independent from the sync timer. L1 samples are tuneable via `VerificationConfig.raslSampleSize` (default 50 blocks).
+**Challenge-response protocol**: Three message types (`StorageChallenge` → `StorageChallengeResponse` → `StorageChallengeResult`). Deterministic generation from epoch + DIDs + nonce. Transport-agnostic with libp2p primary and HTTP fallback (`FailoverChallengeTransport`). Challenge history and peer reliability tracked in SQLite.
 
-### L2 blocker
+**L0** and **L1** run on a configurable timer (default 30 min). L1 samples are tuneable via `VerificationConfig.raslSampleSize` (default 50 blocks).
 
-L2 reuses the same HTTP/RASL verification from L1 but over libp2p transports — giving P2P properties (NAT traversal, encryption, no public IP required) with HTTP simplicity. This requires the [libp2p+HTTP Gateway spec](https://specs.ipfs.tech/http-gateways/libp2p-gateway/) to be implemented in Helia.
+### L2 (libp2p+HTTP Gateway) — future
+
+Reuses RASL verification logic over libp2p transports for NAT traversal and encryption without public IP. Requires [libp2p+HTTP Gateway spec](https://specs.ipfs.tech/http-gateways/libp2p-gateway/) in Helia.
 
-- Kubo (Go) has this: [ipfs/kubo#10049](https://github.com/ipfs/kubo/issues/10049) (shipped)
-- Helia (JS) does not yet: [ipfs/helia#348](https://github.com/ipfs/helia/issues/348) (trustless gateway over libp2p listed as future/out-of-scope)
+- Kubo (Go): [ipfs/kubo#10049](https://github.com/ipfs/kubo/issues/10049) (shipped)
+- Helia (JS): [ipfs/helia#348](https://github.com/ipfs/helia/issues/348) (not yet)
 
 ## Replication
 
-Nodes declare their IPFS identity and replication commitments via AT Protocol records:
+Sync loop (per DID, periodic with policy-driven intervals):
+
+1. Resolve DID → PDS endpoint (via PLC directory)
+2. Fetch repo (`com.atproto.sync.getRepo`, incremental via `since`)
+3. Parse CAR, store blocks in IPFS, fetch and store blobs
+4. Track block/blob CIDs, populate record paths via MST walk
+5. Announce to DHT
+6. Verify local block availability
+7. If source PDS fails, fall back to peer endpoints
 
-- **`org.p2pds.peer/self`** — Binds a DID to a libp2p PeerID + multiaddrs. Updated on startup if PeerID changes.
-- **`org.p2pds.manifest/{did-rkey}`** — One per replicated DID. Declares "I serve this DID's data" with sync status.
+**Real-time sync**: Firehose subscription (`com.atproto.sync.subscribeRepos`) with cursor persistence, DID filtering, and incremental block application. Gossipsub commit notifications for low-latency cross-node coordination.
````
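The deterministic-generation property described for the challenge protocol means prover and verifier can derive the same block sample independently from the shared epoch, DID list, and nonce. The sketch below shows one way that could work; the derivation scheme, function name, and parameters are assumptions, not the project's wire format:

```typescript
import { createHash } from "node:crypto";

// Derive a deterministic set of block indices to challenge. Both peers
// run this with the same inputs and get the same sample, so the
// verifier knows exactly which blocks the prover must produce.
function challengeBlockIndices(
  epoch: number,
  dids: string[],
  nonce: string,
  blockCount: number,
  sampleSize: number,
): number[] {
  const indices = new Set<number>();
  let counter = 0;
  while (indices.size < Math.min(sampleSize, blockCount)) {
    // Hash the shared inputs plus a counter to stream out candidates.
    const digest = createHash("sha256")
      .update(`${epoch}:${dids.join(",")}:${nonce}:${counter++}`)
      .digest();
    indices.add(digest.readUInt32BE(0) % blockCount);
  }
  return [...indices].sort((x, y) => x - y);
}
```

Because the sample is a pure function of shared inputs, the verifier needs no per-challenge state beyond the nonce it issued.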
````diff
 
-Sync loop (per DID, periodic):
+**GC and tombstones**: Deferred GC via `needs_gc` flag on delete/update ops. Full block/blob reconciliation during sync via MST walk. Cross-DID block sharing safety. Tombstone detection via firehose `#account` events with 24hr grace period.
 
-1. Resolve DID → PDS endpoint (via PLC directory)
-2. Discover peer info (`org.p2pds.peer/self` record)
-3. Fetch repo (`com.atproto.sync.getRepo`, incremental via `since`)
-4. Parse CAR, store blocks in IPFS
-5. Track block CIDs for verification
-6. Announce to DHT
-7. Verify local block availability
-8. Update manifest record with sync rev
+## Policy Engine
+
+Declarative, deterministic, transport-agnostic policy system operating on atproto accounts:
+
+- **Mutual aid**: N-of-M redundancy between cooperating peers
+- **SaaS**: SLA compliance with minimum copy counts and sync intervals
+- **Group governance**: Multi-party replication agreements
+
+Policies drive sync intervals, priority ordering, and `shouldReplicate` filtering in the replication manager. P2P policies are auto-generated from mutual offer records with `p2p:` prefixed IDs.
+
+## Admin
+
+- **Dashboard**: Server-rendered HTML at `/xrpc/org.p2pds.admin.dashboard` (auto-refresh)
+- **API**: Authenticated XRPC endpoints for overview, per-DID status, network status, policies, sync history
+- **DID management**: Add/remove DIDs at runtime via `addDid`/`removeDid` endpoints
+- **Rate limiting**: Per-IP and per-DID limits across HTTP, gossipsub, and libp2p
+
+## Desktop App
+
+Optional Tauri v2 wrapper at `apps/desktop/`. Spawns p2pds as a sidecar process and loads the admin dashboard in a webview.
````
````diff
+
+```
+cd apps/desktop
+npm run build:sidecar   # compile p2pds to standalone binary via pkg
+cargo tauri dev         # run in development
+cargo tauri build       # build distributable
+```
 
 ## Development
···
 npm run dev
 ```
 
+### Project structure
+
+```
+src/
+  index.ts            Hono app with all routes
+  server.ts           HTTP server entry point
+  config.ts           Config interface + loadConfig()
+  validation.ts       Record validator (atproto + p2pds lexicons)
+  lexicons.ts         p2pds lexicon loader + validator
+  ipfs.ts             IpfsService (Helia wrapper)
+  repo-manager.ts     Local repo management
+  storage.ts          SQLite block storage
+  blobs.ts            Blob storage
+  middleware/auth.ts  Auth middleware
+  replication/        Sync, verification, challenges, offers, gossipsub
+  policy/             Policy engine types, engine, presets
+  xrpc/               XRPC endpoint handlers
+lexicons/             Lexicon JSON schemas
+apps/desktop/         Tauri desktop app
+```
+
 ### Configuration
 
 Environment variables (or `.env` file):
 
 | Variable | Required | Description |
 |----------|----------|-------------|
-| `DID` | Yes | Your DID (e.g., `did:plc:...`) |
+| `DID` | Yes | Your atproto DID (e.g., `did:plc:...`) |
 | `HANDLE` | Yes | Your handle (e.g., `user.example.com`) |
 | `PDS_HOSTNAME` | Yes | PDS hostname |
-| `AUTH_TOKEN` | Yes | Auth token |
+| `AUTH_TOKEN` | Yes | Static auth token |
 | `SIGNING_KEY` | Yes | Hex-encoded secp256k1 private key |
-| `SIGNING_KEY_PUBLIC` | Yes | Multibase-encoded public key |
+| `SIGNING_KEY_PUBLIC` | No | Multibase-encoded public key |
 | `JWT_SECRET` | Yes | JWT signing secret |
 | `PASSWORD_HASH` | Yes | Bcrypt password hash |
 | `DATA_DIR` | No | Data directory (default: `./data`) |
···
 | `IPFS_ENABLED` | No | Enable IPFS (default: `true`) |
 | `IPFS_NETWORKING` | No | Enable IPFS networking (default: `true`) |
 | `REPLICATE_DIDS` | No | Comma-separated DIDs to replicate |
+| `FIREHOSE_URL` | No | Firehose WebSocket URL |
+| `FIREHOSE_ENABLED` | No | Enable firehose sync (default: `false`) |
+| `POLICY_FILE` | No | Path to policy JSON file |
 
-## Phases
+## Status
 
-1. Single-user PDS working as local node service — **done**
-2. Record replication with IPFS storage — **done**
-3. Layered verification — **done** (L0, L1); **blocked** (L2); **future** (L3)
-4. Policy engine — **research**
+1. Single-user PDS — done
+2. Record replication with IPFS storage — done
+3. Real-time firehose sync — done
+4. Layered verification (L0-L3) — done
+5. Challenge-response proof-of-storage — done
+6. Policy engine — done
+7. P2P offer negotiation — done
+8. Admin dashboard + DID management — done
+9. Rate limiting — done
+10. Architecture refactor (user-DID model) — done
+11. Lexicon definitions — done
+12. Desktop app skeleton — done
````
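A sample `.env` assembled from the configuration table in the diff above; every value below is a placeholder, and only the variables listed in the visible table rows are shown:

```
# Required
DID=did:plc:exampleuser
HANDLE=user.example.com
PDS_HOSTNAME=pds.example.com
AUTH_TOKEN=replace-with-a-long-random-token
SIGNING_KEY=<hex-encoded secp256k1 private key>
JWT_SECRET=replace-with-a-random-secret
PASSWORD_HASH=<bcrypt hash>

# Optional
DATA_DIR=./data
IPFS_ENABLED=true
IPFS_NETWORKING=true
REPLICATE_DIDS=did:plc:peer1,did:plc:peer2
FIREHOSE_ENABLED=true
POLICY_FILE=./policies.json
```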