personal activity index (bluesky, leaflet, substack) pai.desertthunder.dev

<!-- markdownlint-disable MD033 -->

# Personal Activity Index

A CLI that ingests content from Substack, Bluesky, Leaflet, and BearBlog into SQLite, with an optional Cloudflare Worker + D1 deployment path.

## Features

- Fetch posts from multiple sources:
  - **Substack** via RSS feeds
  - **Bluesky** via AT Protocol
  - **Leaflet** publications via RSS feeds
  - **BearBlog** publications via RSS feeds
- Local SQLite storage with full-text search
- Flexible filtering and querying via `pai list` / `pai export`
- Self-hostable HTTP API (`pai serve` exposes `/api/feed`, `/api/item/{id}`, and `/status`)
- Cloudflare Worker deployment path (D1) for serverless setups

## Quick Start

```bash
# Install
cargo install --path cli

# Initialize config (creates ~/.config/pai/config.toml)
pai init

# Edit config with your sources
$EDITOR ~/.config/pai/config.toml

# Sync content
pai sync

# List items
pai list -n 10

# Check database
pai db-check

# Install the manpage so `man pai` works
pai man --install

# Generate manpage to a file
pai man -o pai.1
```

<details>
<summary>For server mode, run the built-in HTTP server against your SQLite database:</summary>

<br>

```bash
pai serve -d /var/lib/pai/pai.db -a 127.0.0.1:8080
```

Endpoints:

- `GET /api/feed` – list newest items (supports `source_kind`, `source_id`, `limit`, `since`, `q`)
- `GET /api/item/{id}` – fetch a single item
- `GET /status` – health/status summary (total items, counts per source)

For reverse-proxy examples (nginx, Caddy, Docker), see [DEPLOYMENT.md](./DEPLOYMENT.md).

</details>

## Configuration

Configuration is loaded from `$XDG_CONFIG_HOME/pai/config.toml` or `$HOME/.config/pai/config.toml`.

See [config.example.toml](./config.example.toml) for a complete example with all available options.
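
As a rough illustration of the lookup described above, the following Rust sketch resolves the config file path from the two environment values. It assumes that `$XDG_CONFIG_HOME` takes precedence when it is set and non-empty, which this README does not state explicitly. `config_path` is a hypothetical helper written for this example, not a function from the `pai` codebase.

```rust
use std::path::PathBuf;

/// Resolve the config file path from the two environment values.
/// Hypothetical helper: the values are passed in so the logic is easy to test.
fn config_path(xdg_config_home: Option<&str>, home: Option<&str>) -> Option<PathBuf> {
    // Assumption: prefer $XDG_CONFIG_HOME when it is set and non-empty.
    let base = match xdg_config_home.filter(|s| !s.is_empty()) {
        Some(xdg) => PathBuf::from(xdg),
        // Otherwise fall back to $HOME/.config; give up if HOME is missing too.
        None => PathBuf::from(home.filter(|s| !s.is_empty())?).join(".config"),
    };
    Some(base.join("pai").join("config.toml"))
}

fn main() {
    let xdg = std::env::var("XDG_CONFIG_HOME").ok();
    let home = std::env::var("HOME").ok();
    match config_path(xdg.as_deref(), home.as_deref()) {
        Some(path) => println!("{}", path.display()),
        None => eprintln!("neither XDG_CONFIG_HOME nor HOME is set"),
    }
}
```

Passing the environment values as arguments keeps the function pure, so both branches can be exercised without mutating the process environment.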

## Documentation

- CLI synopsis: `pai -h`, `pai <command> -h`, or `pai man` for the generated `pai(1)` page.
- `pai man --install [--install-dir DIR]` copies `pai.1` into a MANPATH directory (defaults to `~/.local/share/man/man1`) so `man pai` works like any other UNIX tool.
- Database schema and config reference: [config.example.toml](./config.example.toml).
- Deployment topologies: [DEPLOYMENT.md](./DEPLOYMENT.md).

## Architecture

The project is organized as a Cargo workspace:

```sh
.
├── core    # Shared types, fetchers, and the storage trait
├── cli     # CLI binary (POSIX-compliant)
└── worker  # Cloudflare Worker deployment using workers-rs
```

<details>
<summary><strong>Source Implementations</strong></summary>

### Substack (RSS)

The Substack fetcher uses the standard RSS 2.0 feed available at `{base_url}/feed`.

**Implementation:**

- Fetches RSS feed using `feed-rs` parser
- Maps RSS `<item>` elements to standardized `Item` struct
- Uses GUID as item ID, falls back to link if GUID is missing
- Normalizes `pubDate` to ISO 8601 format

**Key mappings:**

- `id` = RSS GUID or link
- `source_kind` = `substack`
- `source_id` = Domain extracted from `base_url`
- `title` = RSS title
- `summary` = RSS description
- `url` = RSS link
- `content_html` = RSS content (if available)
- `published_at` = RSS pubDate (normalized to ISO 8601)

**Example RSS structure:**

```xml
<item>
  <title>Post Title</title>
  <link>https://example.substack.com/p/post-slug</link>
  <guid>https://example.substack.com/p/post-slug</guid>
  <pubDate>Mon, 01 Jan 2024 12:00:00 +0000</pubDate>
  <description>Post summary or excerpt</description>
</item>
```

### AT Protocol Integration (Bluesky)

#### Overview

Bluesky is built on the AT Protocol (Authenticated Transfer Protocol), a decentralized social networking protocol.

**Key Concepts:**

- **DID (Decentralized Identifier)**: Unique identifier for users (e.g., `did:plc:xyz123`)
- **Handle**: Human-readable identifier (e.g., `user.bsky.social`)
- **AT URI**: Resource identifier (e.g., `at://did:plc:xyz/app.bsky.feed.post/abc123`)
- **Lexicon**: Schema definition language for records and API methods
- **XRPC**: HTTP API wrapper for AT Protocol methods
- **PDS (Personal Data Server)**: Server that stores user data

#### Implementation

Bluesky uses standard `app.bsky.feed.post` records and provides a public API for fetching posts.

**Endpoint:** `GET https://public.api.bsky.app/xrpc/app.bsky.feed.getAuthorFeed`

**Parameters:**

- `actor` - User handle or DID
- `limit` - Number of posts to fetch (default: 50)
- `cursor` - Pagination cursor (optional)

**Implementation:**

- Fetches author feed using `app.bsky.feed.getAuthorFeed`
- Filters out reposts and quotes (only includes original posts)
- Converts AT URIs to canonical Bluesky URLs
- Truncates long post text to create titles

**Key mappings:**

- `id` = AT URI (e.g., `at://did:plc:xyz/app.bsky.feed.post/abc123`)
- `source_kind` = `bluesky`
- `source_id` = User handle
- `title` = Truncated post text (first 100 chars)
- `summary` = Full post text
- `url` = Canonical URL (`https://bsky.app/profile/{handle}/post/{post_id}`)
- `author` = Post author handle
- `published_at` = Post `createdAt` timestamp

**Filtering reposts:**
Posts with a `reason` field (indicating a repost or quote) are excluded so that only original content is fetched.

### Leaflet (RSS)

#### Overview

Leaflet publications provide RSS feeds at `{base_url}/rss`, making them straightforward to fetch using standard RSS parsing.

**Note:** While Leaflet is built on AT Protocol and uses custom `pub.leaflet.post` records, we use RSS feeds for simplicity and reliability.
Leaflet's RSS implementation provides all necessary metadata without requiring AT Protocol PDS queries.

**Implementation:**

- Fetches RSS feed using `feed-rs` parser
- Maps RSS `<item>` elements to standardized `Item` struct
- Supports multiple publications via config array
- Uses entry ID from feed, falls back to link if missing
- Normalizes publication dates to ISO 8601 format

**Key mappings:**

- `id` = RSS entry ID or link
- `source_kind` = `leaflet`
- `source_id` = Publication ID from config (e.g., `desertthunder`, `stormlightlabs`)
- `title` = RSS entry title
- `summary` = RSS entry summary/description
- `url` = RSS entry link
- `content_html` = RSS content body (if available)
- `author` = RSS entry author
- `published_at` = RSS published date or updated date (normalized to ISO 8601)

**Configuration:**

Leaflet supports multiple publications through array configuration:

```toml
[[sources.leaflet]]
enabled = true
id = "desertthunder"
base_url = "https://desertthunder.leaflet.pub"

[[sources.leaflet]]
enabled = true
id = "stormlightlabs"
base_url = "https://stormlightlabs.leaflet.pub"
```

**Example RSS structure:**

```xml
<item>
  <title>Dev Log: 2025-11-22</title>
  <link>https://desertthunder.leaflet.pub/3m6a7fuk7u22p</link>
  <guid>https://desertthunder.leaflet.pub/3m6a7fuk7u22p</guid>
  <pubDate>Sat, 22 Nov 2025 16:22:54 +0000</pubDate>
  <description>Post summary or excerpt</description>
</item>
```

### BearBlog (RSS)

#### Overview

BearBlog is a minimalist blogging platform that provides RSS feeds at `{slug}.bearblog.dev/feed/`, making them straightforward to fetch using standard RSS parsing.
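
The feed locations this README gives for the three RSS-based sources (`{base_url}/feed` for Substack, `{base_url}/rss` for Leaflet, and `{slug}.bearblog.dev/feed/` for BearBlog) can be sketched as one small function. This is a minimal illustration under stated assumptions: `RssSource` and `feed_url` are hypothetical names, the BearBlog `base_url` is assumed to already be `https://{slug}.bearblog.dev`, and trimming trailing slashes is a choice made for this example rather than documented behavior.

```rust
/// RSS-based source kinds named in this README (hypothetical enum for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
enum RssSource {
    Substack,
    Leaflet,
    BearBlog,
}

/// Build the feed URL for a source from its configured base URL (hypothetical helper).
fn feed_url(kind: RssSource, base_url: &str) -> String {
    // Drop trailing slashes so "https://x/" and "https://x" give the same result.
    let base = base_url.trim_end_matches('/');
    match kind {
        RssSource::Substack => format!("{}/feed", base),
        RssSource::Leaflet => format!("{}/rss", base),
        RssSource::BearBlog => format!("{}/feed/", base),
    }
}

fn main() {
    println!("{}", feed_url(RssSource::Substack, "https://example.substack.com"));
    println!("{}", feed_url(RssSource::Leaflet, "https://desertthunder.leaflet.pub/"));
    println!("{}", feed_url(RssSource::BearBlog, "https://desertthunder.bearblog.dev"));
}
```

Keeping the per-source suffix in a single `match` means a new RSS source only needs one extra arm, and the compiler flags any variant that is left unhandled.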

**Implementation:**

- Fetches RSS feed using `feed-rs` parser
- Maps RSS `<item>` elements to standardized `Item` struct
- Supports multiple blogs via config array
- Uses entry ID from feed, falls back to link if missing
- Normalizes publication dates to ISO 8601 format

**Key mappings:**

- `id` = RSS entry ID or link
- `source_kind` = `bearblog`
- `source_id` = Blog ID from config (e.g., `desertthunder`)
- `title` = RSS entry title
- `summary` = RSS entry summary/description
- `url` = RSS entry link
- `content_html` = RSS content body (if available)
- `author` = RSS entry author
- `published_at` = RSS published date or updated date (normalized to ISO 8601)

**Configuration:**

BearBlog supports multiple blogs through array configuration:

```toml
[[sources.bearblog]]
enabled = true
id = "desertthunder"
base_url = "https://desertthunder.bearblog.dev"

[[sources.bearblog]]
enabled = true
id = "another-blog"
base_url = "https://another-blog.bearblog.dev"
```

**Example RSS structure:**

```xml
<item>
  <title>My Blog Post</title>
  <link>https://desertthunder.bearblog.dev/my-blog-post</link>
  <guid>https://desertthunder.bearblog.dev/my-blog-post</guid>
  <pubDate>Sat, 22 Nov 2025 16:22:54 +0000</pubDate>
  <description>Post summary or excerpt</description>
</item>
```

</details>

## References

- [AT Protocol Documentation](https://atproto.com)
- [Lexicon Guide](https://atproto.com/guides/lexicon) - Schema definition language
- [XRPC Specification](https://atproto.com/specs/xrpc) - HTTP API wrapper
- [Bluesky API Documentation](https://docs.bsky.app/)
- [Leaflet](https://tangled.org/leaflet.pub/leaflet) - Leaflet source code
- [Leaflet Manual](https://about.leaflet.pub/) - User-facing documentation

## License

See [LICENSE](./LICENSE)