<!-- markdownlint-disable MD033 -->

# Personal Activity Index

A CLI that ingests content from Substack, Bluesky, and Leaflet into SQLite, with an optional Cloudflare Worker + D1 deployment path.

## Features

- Fetch posts from multiple sources:
  - **Substack** via RSS feeds
  - **Bluesky** via AT Protocol
  - **Leaflet** publications via RSS feeds
- Local SQLite storage with full-text search
- Flexible filtering and querying
- Self-hostable or serverless (Cloudflare Workers)

## Quick Start

```bash
# Install
cargo install --path cli

# Initialize config (creates ~/.config/pai/config.toml)
pai init

# Edit config with your sources
$EDITOR ~/.config/pai/config.toml

# Sync content
pai sync

# List items
pai list -n 10

# Check database
pai db-check
```

## Configuration

Configuration is loaded from `$XDG_CONFIG_HOME/pai/config.toml` or `$HOME/.config/pai/config.toml`.

See [config.example.toml](./config.example.toml) for a complete example with all available options.
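
The lookup order above can be sketched roughly as follows. This is a minimal illustration, not the CLI's actual code: the helper name `resolve_config_path` is hypothetical, and it assumes that `$XDG_CONFIG_HOME` wins when set and that an empty value counts as unset.

```rust
use std::path::PathBuf;

/// Resolve the config file location from the two environment values.
/// Hypothetical helper: name and signature are illustrative only.
fn resolve_config_path(xdg_config_home: Option<&str>, home: Option<&str>) -> Option<PathBuf> {
    // Assumption: prefer $XDG_CONFIG_HOME when it is set and non-empty.
    if let Some(xdg) = xdg_config_home.filter(|v| !v.is_empty()) {
        return Some(PathBuf::from(xdg).join("pai").join("config.toml"));
    }
    // Otherwise fall back to $HOME/.config/pai/config.toml.
    home.filter(|v| !v.is_empty())
        .map(|h| PathBuf::from(h).join(".config").join("pai").join("config.toml"))
}
```

A caller could pass `std::env::var("XDG_CONFIG_HOME").ok().as_deref()` and `std::env::var("HOME").ok().as_deref()` as the two arguments; taking the values as parameters keeps the function easy to test.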

## Architecture

The project is organized as a Cargo workspace:

```sh
.
├── core    # Shared types, fetchers, and the storage trait
├── cli     # CLI binary (POSIX-compliant)
└── worker  # Cloudflare Worker deployment using workers-rs
```
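
To make the role of `core` more concrete, here is a rough sketch of what a shared item type and a storage trait could look like. The `Item` field names follow the "Key mappings" lists under Source Implementations; the field types, the `Storage` trait, its `upsert` and `list` methods, and `MemoryStorage` are all assumptions for illustration and may not match the real crate.

```rust
use std::collections::HashMap;

/// Normalized record shared by all sources (field types are assumptions).
#[derive(Debug, Clone, PartialEq)]
struct Item {
    id: String,
    source_kind: String,
    source_id: String,
    title: Option<String>,
    summary: Option<String>,
    url: String,
    content_html: Option<String>,
    author: Option<String>,
    published_at: String, // ISO 8601
}

/// Hypothetical storage abstraction; the real trait in `core` may differ.
trait Storage {
    /// Insert the item, or replace an existing one with the same `id`.
    fn upsert(&mut self, item: Item);
    /// Return up to `limit` items, newest first.
    fn list(&self, limit: usize) -> Vec<Item>;
}

/// In-memory stand-in for a SQLite or D1 backend.
#[derive(Default)]
struct MemoryStorage {
    items: HashMap<String, Item>,
}

impl Storage for MemoryStorage {
    fn upsert(&mut self, item: Item) {
        self.items.insert(item.id.clone(), item);
    }

    fn list(&self, limit: usize) -> Vec<Item> {
        let mut all: Vec<Item> = self.items.values().cloned().collect();
        // ISO 8601 timestamps with the same offset sort lexicographically.
        all.sort_by(|a, b| b.published_at.cmp(&a.published_at));
        all.truncate(limit);
        all
    }
}
```

Keeping storage behind a trait like this is one way the same fetchers could write to local SQLite from the CLI and to D1 from the Worker.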

<details>
<summary><strong>Source Implementations</strong></summary>

### Substack (RSS)

The Substack fetcher uses the standard RSS 2.0 feed available at `{base_url}/feed`.

**Implementation:**

- Fetches the RSS feed using the `feed-rs` parser
- Maps RSS `<item>` elements to the standardized `Item` struct
- Uses the GUID as the item ID, falling back to the link if the GUID is missing
- Normalizes `pubDate` to ISO 8601 format

**Key mappings:**

- `id` = RSS GUID or link
- `source_kind` = `substack`
- `source_id` = Domain extracted from `base_url`
- `title` = RSS title
- `summary` = RSS description
- `url` = RSS link
- `content_html` = RSS content (if available)
- `published_at` = RSS pubDate (normalized to ISO 8601)
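
Two of the mappings above involve a small amount of logic: the GUID-or-link fallback for `id`, and the domain extraction for `source_id`. A minimal sketch using only the standard library follows. The function names `item_id` and `source_id_from_base_url` are hypothetical, and the URL handling is deliberately simplified (it ignores userinfo and IPv6 hosts), so the actual fetcher may well use a URL parsing crate instead.

```rust
/// Pick the item ID: the GUID when present and non-empty, otherwise the link.
fn item_id(guid: Option<&str>, link: &str) -> String {
    match guid.map(str::trim) {
        Some(g) if !g.is_empty() => g.to_string(),
        _ => link.to_string(),
    }
}

/// Extract the domain from a base URL such as "https://example.substack.com/".
/// Simplified: strips the scheme, cuts at the first '/', '?' or '#', drops a port.
fn source_id_from_base_url(base_url: &str) -> Option<String> {
    let rest = base_url.split_once("://").map_or(base_url, |(_, r)| r);
    let host = rest
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    let host = host.split(':').next()?; // drop ":port" if present
    if host.is_empty() {
        None
    } else {
        Some(host.to_ascii_lowercase())
    }
}
```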

**Example RSS structure:**

```xml
<item>
  <title>Post Title</title>
  <link>https://example.substack.com/p/post-slug</link>
  <guid>https://example.substack.com/p/post-slug</guid>
  <pubDate>Mon, 01 Jan 2024 12:00:00 +0000</pubDate>
  <description>Post summary or excerpt</description>
</item>
```
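
For illustration, the `pubDate` normalization can be sketched by hand as below. In practice the `feed-rs` parser handles date parsing, so this is not the project's implementation; the function name `normalize_pub_date` is hypothetical, and the sketch keeps the original UTC offset rather than converting to UTC.

```rust
/// Convert an RSS `pubDate` such as "Mon, 01 Jan 2024 12:00:00 +0000"
/// into ISO 8601 ("2024-01-01T12:00:00+00:00"), keeping the original offset.
/// Simplified: expects seconds and a numeric offset; named zones like "GMT" are rejected.
fn normalize_pub_date(pub_date: &str) -> Option<String> {
    const MONTHS: [&str; 12] = [
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
    ];

    // Drop the optional leading weekday ("Mon, ").
    let rest = pub_date.split_once(',').map_or(pub_date, |(_, r)| r);
    let mut parts = rest.split_whitespace();

    let day: u32 = parts.next()?.parse().ok()?;
    let month_name = parts.next()?;
    let month = MONTHS.iter().position(|m| m.eq_ignore_ascii_case(month_name))? + 1;
    let year: u32 = parts.next()?.parse().ok()?;
    let time = parts.next()?;
    let offset = parts.next()?;

    // Split "HH:MM:SS" into numeric fields.
    let mut hms = time.split(':').map(|p| p.parse::<u32>().ok());
    let (hh, mm, ss) = (hms.next()??, hms.next()??, hms.next()??);

    // Accept only numeric offsets such as "+0000" or "-0500".
    let sign = offset.chars().next().filter(|c| *c == '+' || *c == '-')?;
    let digits = &offset[1..];
    if digits.len() != 4 || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    if !(1..=31).contains(&day) || hh > 23 || mm > 59 || ss > 60 {
        return None;
    }

    Some(format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}{}{}:{}",
        year, month, day, hh, mm, ss, sign, &digits[..2], &digits[2..]
    ))
}
```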

### AT Protocol Integration (Bluesky)

#### Overview

Bluesky is built on the AT Protocol (Authenticated Transfer Protocol), a decentralized social networking protocol.

**Key Concepts:**

- **DID (Decentralized Identifier)**: Unique identifier for users (e.g., `did:plc:xyz123`)
- **Handle**: Human-readable identifier (e.g., `user.bsky.social`)
- **AT URI**: Resource identifier (e.g., `at://did:plc:xyz/app.bsky.feed.post/abc123`)
- **Lexicon**: Schema definition language for records and API methods
- **XRPC**: HTTP API wrapper for AT Protocol methods
- **PDS (Personal Data Server)**: Server that stores user data

#### Implementation

Bluesky uses standard `app.bsky.feed.post` records and provides a public API for fetching posts.

**Endpoint:** `GET https://public.api.bsky.app/xrpc/app.bsky.feed.getAuthorFeed`

**Parameters:**

- `actor` - User handle or DID
- `limit` - Number of posts to fetch (default: 50)
- `cursor` - Pagination cursor (optional)
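
As a rough sketch, the request URL for this endpoint could be assembled from the three parameters as shown below. The helper names `encode_query_value` and `author_feed_url` are hypothetical, and the hand-rolled percent-encoding stands in for whatever HTTP client or URL crate the fetcher actually uses.

```rust
/// Percent-encode a query value, leaving RFC 3986 unreserved characters as-is.
fn encode_query_value(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for byte in value.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Build the getAuthorFeed URL; `cursor` is only appended when paginating.
fn author_feed_url(actor: &str, limit: u8, cursor: Option<&str>) -> String {
    let mut url = format!(
        "https://public.api.bsky.app/xrpc/app.bsky.feed.getAuthorFeed?actor={}&limit={}",
        encode_query_value(actor),
        limit
    );
    if let Some(c) = cursor {
        url.push_str("&cursor=");
        url.push_str(&encode_query_value(c));
    }
    url
}
```

To page through a feed, a caller would pass the `cursor` value from each response into the next request until the response no longer includes one.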

**Implementation:**

- Fetches author feed using `app.bsky.feed.getAuthorFeed`
- Filters out reposts and quotes (only includes original posts)
- Converts AT URIs to canonical Bluesky URLs
- Truncates long post text to create titles

**Key mappings:**

- `id` = AT URI (e.g., `at://did:plc:xyz/app.bsky.feed.post/abc123`)
- `source_kind` = `bluesky`
- `source_id` = User handle
- `title` = Truncated post text (first 100 chars)
- `summary` = Full post text
- `url` = Canonical URL (`https://bsky.app/profile/{handle}/post/{post_id}`)
- `author` = Post author handle
- `published_at` = Post `createdAt` timestamp
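
The `url` and `title` mappings can be sketched as two small helpers. The names `canonical_post_url` and `title_from_text` are hypothetical, and whether the real fetcher appends an ellipsis or counts bytes instead of characters is not stated here, so this sketch simply takes the first N Unicode characters.

```rust
/// Convert "at://did:plc:xyz/app.bsky.feed.post/abc123" into
/// "https://bsky.app/profile/{handle}/post/abc123".
fn canonical_post_url(at_uri: &str, handle: &str) -> Option<String> {
    let rest = at_uri.strip_prefix("at://")?;
    let mut parts = rest.split('/');
    let _authority = parts.next()?; // DID (or handle) that owns the record
    let collection = parts.next()?;
    let post_id = parts.next()?; // record key, the last path segment
    if collection != "app.bsky.feed.post" || post_id.is_empty() || parts.next().is_some() {
        return None;
    }
    Some(format!("https://bsky.app/profile/{}/post/{}", handle, post_id))
}

/// Build a title from the first `max_chars` characters of the post text.
/// Counts Unicode scalar values, so it never splits a multi-byte character.
fn title_from_text(text: &str, max_chars: usize) -> String {
    text.chars().take(max_chars).collect()
}
```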

**Filtering reposts:**
Feed items that carry a `reason` field (set when the item appears in the feed for a reason other than being the author's own post, such as a repost) are excluded so that only original content is fetched.
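
A minimal sketch of the `reason` check follows. `FeedEntry` is a hypothetical, heavily trimmed stand-in for one element of the `feed` array in the response; the real code would deserialize the full JSON structure.

```rust
/// Trimmed stand-in for one entry of the `feed` array (hypothetical type).
struct FeedEntry {
    uri: String,
    text: String,
    /// `$type` of the `reason` object, if the entry has one.
    reason: Option<String>,
}

/// Keep only entries without a `reason`, i.e. the author's own posts.
fn original_posts(feed: Vec<FeedEntry>) -> Vec<FeedEntry> {
    feed.into_iter().filter(|entry| entry.reason.is_none()).collect()
}
```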

### Leaflet (RSS)

#### Overview

Leaflet publications provide RSS feeds at `{base_url}/rss`, making them straightforward to fetch using standard RSS parsing.

**Note:** While Leaflet is built on AT Protocol and uses custom `pub.leaflet.post` records, we use RSS feeds for simplicity and reliability. Leaflet's RSS implementation provides all necessary metadata without requiring AT Protocol PDS queries.

**Implementation:**

- Fetches RSS feed using `feed-rs` parser
- Maps RSS `<item>` elements to standardized `Item` struct
- Supports multiple publications via config array
- Uses entry ID from feed, falls back to link if missing
- Normalizes publication dates to ISO 8601 format

**Key mappings:**

- `id` = RSS entry ID or link
- `source_kind` = `leaflet`
- `source_id` = Publication ID from config (e.g., `desertthunder`, `stormlightlabs`)
- `title` = RSS entry title
- `summary` = RSS entry summary/description
- `url` = RSS entry link
- `content_html` = RSS content body (if available)
- `author` = RSS entry author
- `published_at` = RSS published date or updated date (normalized to ISO 8601)

**Configuration:**

Leaflet supports multiple publications through array configuration:

```toml
[[sources.leaflet]]
enabled = true
id = "desertthunder"
base_url = "https://desertthunder.leaflet.pub"

[[sources.leaflet]]
enabled = true
id = "stormlightlabs"
base_url = "https://stormlightlabs.leaflet.pub"
```
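
One way to picture how these entries drive fetching: each enabled table yields a feed URL of the form `{base_url}/rss`, paired with the `id` that becomes `source_id`. The sketch below assumes that disabled entries are skipped and that a trailing slash on `base_url` should be tolerated; `LeafletSource`, `leaflet_feed_url`, and `enabled_feeds` are hypothetical names.

```rust
/// Mirrors one `[[sources.leaflet]]` table (hypothetical struct).
struct LeafletSource {
    enabled: bool,
    id: String,
    base_url: String,
}

/// Feed URL for a publication: `{base_url}/rss`, tolerating a trailing slash.
fn leaflet_feed_url(base_url: &str) -> String {
    format!("{}/rss", base_url.trim_end_matches('/'))
}

/// Pair each enabled publication's `id` (used as `source_id`) with its feed URL.
fn enabled_feeds(sources: &[LeafletSource]) -> Vec<(String, String)> {
    sources
        .iter()
        .filter(|s| s.enabled)
        .map(|s| (s.id.clone(), leaflet_feed_url(&s.base_url)))
        .collect()
}
```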

**Example RSS structure:**

```xml
<item>
  <title>Dev Log: 2025-11-22</title>
  <link>https://desertthunder.leaflet.pub/3m6a7fuk7u22p</link>
  <guid>https://desertthunder.leaflet.pub/3m6a7fuk7u22p</guid>
  <pubDate>Sat, 22 Nov 2025 16:22:54 +0000</pubDate>
  <description>Post summary or excerpt</description>
</item>
```

</details>

## References

- [AT Protocol Documentation](https://atproto.com)
- [Lexicon Guide](https://atproto.com/guides/lexicon) - Schema definition language
- [XRPC Specification](https://atproto.com/specs/xrpc) - HTTP API wrapper
- [Bluesky API Documentation](https://docs.bsky.app/)
- [Leaflet](https://tangled.org/leaflet.pub/leaflet) - Leaflet source code
- [Leaflet Manual](https://about.leaflet.pub/) - User-facing documentation

## License

See the [LICENSE](./LICENSE) file for details.