# Design System Inspired by Spotify

## 1. Visual Theme & Atmosphere

Spotify's web interface is a dark, immersive music player that wraps listeners in a near-black cocoon (`#121212`, `#181818`, `#1f1f1f`) where album art and content become the primary source of color. The design philosophy is "content-first darkness" — the UI recedes into shadow so that music, podcasts, and playlists can glow. Every surface is a shade of charcoal, creating a theater-like environment where the only true color comes from the brand Accent Purple (`#a855f7`) and the album artwork itself.

The typography uses SpotifyMixUI and SpotifyMixUITitle — proprietary fonts from the CircularSp family (Circular by Lineto, customized for Spotify) with an extensive fallback stack that includes Arabic, Hebrew, Cyrillic, Greek, Devanagari, and CJK fonts, reflecting Spotify's global reach. The type system is compact and functional: 700 (bold) for emphasis and navigation, 600 (semibold) for secondary emphasis, and 400 (regular) for body. Buttons use uppercase with positive letter-spacing (1.4px–2px) for a systematic, label-like quality.

What distinguishes Spotify is its pill-and-circle geometry. Primary buttons use 500px–9999px radius (full pill), circular play buttons use 50% radius, and search inputs are 500px pills. Combined with heavy shadows (`rgba(0,0,0,0.5) 0px 8px 24px`) on elevated elements and a unique inset border-shadow combo (`rgb(18,18,18) 0px 1px 0px, rgb(124,124,124) 0px 0px 0px 1px inset`), the result is an interface that feels like a premium audio device — tactile, rounded, and built for touch.

**Key Characteristics:**

- Near-black immersive dark theme (`#121212`–`#1f1f1f`) — UI disappears behind content
- Accent Purple (`#a855f7`) as singular brand accent — never decorative, always functional
- SpotifyMixUI/CircularSp font family with global script support
- Pill buttons (500px–9999px) and circular controls (50%) — rounded, touch-optimized
- Uppercase button labels with wide letter-spacing (1.4px–2px)
- Heavy shadows on elevated elements (`rgba(0,0,0,0.5) 0px 8px 24px`)
- Semantic colors: negative red (`#f3727f`), warning orange (`#ffa42b`), announcement blue (`#539df5`)
- Album art as the primary color source — the UI is achromatic by design
- Light/dark theme toggle — persisted in localStorage, dark is default

## 2. Color Palette & Roles

All theme-dependent colors use CSS custom properties defined on `:root` (dark, default) and `[data-theme="light"]` (light). Accent and semantic colors are constant across themes.

### Tailwind Token Mapping

The `spot.*` namespace provides semantic tokens. Theme-dependent tokens resolve to CSS variables.

| Token | CSS Variable | Dark (default) | Light | Use |
| ---------------------- | --------------------- | -------------------------------- | -------------------------------- | -------------------------------- |
| `spot.purple` | — | `#a855f7` | `#a855f7` | Primary accent |
| `spot.purple-border` | — | `#9333ea` | `#9333ea` | Accent border variant |
| `spot.bg` | `--spot-bg` | `#121212` | `#f5f5f5` | Page background |
| `spot.surface` | `--spot-surface` | `#181818` | `#ffffff` | Cards, containers, sidebar |
| `spot.hover` | `--spot-hover` | `#1f1f1f` | `#f3f4f6` | Interactive surface, input bg |
| `spot.hover-50` | `--spot-hover-50` | `rgba(31,31,31,0.5)` | `rgba(243,244,246,0.5)` | Card hover with transparency |
| `spot.text` | `--spot-text` | `#ffffff` | `#111827` | Primary text, headings |
| `spot.secondary` | `--spot-secondary` | `#b3b3b3` | `#6b7280` | Secondary text, muted labels |
| `spot.body` | `--spot-body` | `#cbcbcb` | `#374151` | Article body text |
| `spot.muted` | `--spot-muted` | `#4d4d4d` | `#9ca3af` | Very muted text, rank numbers |
| `spot.divider` | `--spot-divider` | `rgba(77,77,77,0.2)` | `#e5e7eb` | Subtle border dividers |
| `spot.divider-30` | `--spot-divider-30` | `rgba(77,77,77,0.3)` | `#d1d5db` | Slightly stronger dividers, hr |
| `spot.outline` | `--spot-outline` | `#7c7c7c` | `#d1d5db` | Visible borders on buttons/inputs |
| `spot.placeholder` | `--spot-placeholder` | `#4d4d4d` | `#9ca3af` | Input placeholder text |
| `spot.active-pill-bg` | `--spot-active-bg` | `#ffffff` | `#111827` | Active filter pill background |
| `spot.active-pill-text` | `--spot-active-text` | `#121212` | `#ffffff` | Active filter pill text |
| `spot.shadow` | `--spot-shadow` | `rgba(0,0,0,0.3) 0px 8px 8px` | `rgba(0,0,0,0.08) 0px 2px 8px` | Card elevation shadow |
| `spot.shadow-heavy` | `--spot-shadow-heavy` | `rgba(0,0,0,0.5) 0px 8px 24px` | `rgba(0,0,0,0.1) 0px 4px 16px` | Dialog/elevated panel shadow |
| `spot.red` | — | `#f3727f` | `#f3727f` | Error states |
| `spot.orange` | — | `#ffa42b` | `#ffa42b` | Warning states |
| `spot.blue` | — | `#539df5` | `#539df5` | Info states |

### Shadows

- **Card** (`var(--spot-shadow)`): Cards, dropdowns
- **Heavy** (`var(--spot-shadow-heavy)`): Dialogs, menus, elevated panels
- **Inset Border** (`rgb(18,18,18) 0px 1px 0px, rgb(124,124,124) 0px 0px 0px 1px inset`): Input border-shadow combo (dark only)

## 3. Theme Toggle

The app supports dark (default) and light themes. Theme preference is stored in `localStorage` under the key `theme`.

### Implementation

- CSS custom properties are defined on `:root` for dark and `[data-theme="light"]` for light
- A `<script>` block runs before render to set `data-theme` on `<html>`, preventing flash of wrong theme
- Toggle buttons appear in the sidebar footer (desktop) and mobile top bar
- Icon: moon (when in dark mode) / sun (when in light mode)
- Default: dark

### CSS Variables

```css
:root {
  --spot-bg: #121212;
  --spot-surface: #181818;
  --spot-hover: #1f1f1f;
  --spot-hover-50: rgba(31,31,31,0.5);
  --spot-text: #ffffff;
  --spot-secondary: #b3b3b3;
  --spot-body: #cbcbcb;
  --spot-muted: #4d4d4d;
  --spot-divider: rgba(77,77,77,0.2);
  --spot-divider-30: rgba(77,77,77,0.3);
  --spot-outline: #7c7c7c;
  --spot-placeholder: #4d4d4d;
  --spot-active-bg: #ffffff;
  --spot-active-text: #121212;
  --spot-shadow: rgba(0,0,0,0.3) 0px 8px 8px;
  --spot-shadow-heavy: rgba(0,0,0,0.5) 0px 8px 24px;
}
[data-theme="light"] {
  --spot-bg: #f5f5f5;
  --spot-surface: #ffffff;
  --spot-hover: #f3f4f6;
  --spot-hover-50: rgba(243,244,246,0.5);
  --spot-text: #111827;
  --spot-secondary: #6b7280;
  --spot-body: #374151;
  --spot-muted: #9ca3af;
  --spot-divider: #e5e7eb;
  --spot-divider-30: #d1d5db;
  --spot-outline: #d1d5db;
  --spot-placeholder: #9ca3af;
  --spot-active-bg: #111827;
  --spot-active-text: #ffffff;
  --spot-shadow: rgba(0,0,0,0.08) 0px 2px 8px;
  --spot-shadow-heavy: rgba(0,0,0,0.1) 0px 4px 16px;
}
```

## 4. Typography Rules

### Font Families

- **Title**: `SpotifyMixUITitle`, fallbacks: `CircularSp-Arab, CircularSp-Hebr, CircularSp-Cyrl, CircularSp-Grek, CircularSp-Deva, Helvetica Neue, helvetica, arial, Hiragino Sans, Hiragino Kaku Gothic ProN, Meiryo, MS Gothic`
- **UI / Body**: `SpotifyMixUI`, same fallback stack

### Hierarchy

| Role | Font | Size | Weight | Line Height | Letter Spacing | Notes |
| ---------------- | ----------------- | ---------------- | ------- | ------------ | -------------- | ---------------------------- |
| Section Title | SpotifyMixUITitle | 24px (1.50rem) | 700 | normal | normal | Bold title weight |
| Feature Heading | SpotifyMixUI | 18px (1.13rem) | 600 | 1.30 (tight) | normal | Semibold section heads |
| Body Bold | SpotifyMixUI | 16px (1.00rem) | 700 | normal | normal | Emphasized text |
| Body | SpotifyMixUI | 16px (1.00rem) | 400 | normal | normal | Standard body |
| Button Uppercase | SpotifyMixUI | 14px (0.88rem) | 600–700 | 1.00 (tight) | 1.4px–2px | `text-transform: uppercase` |
| Button | SpotifyMixUI | 14px (0.88rem) | 700 | normal | 0.14px | Standard button |
| Nav Link Bold | SpotifyMixUI | 14px (0.88rem) | 700 | normal | normal | Navigation |
| Nav Link | SpotifyMixUI | 14px (0.88rem) | 400 | normal | normal | Inactive nav |
| Caption Bold | SpotifyMixUI | 14px (0.88rem) | 700 | 1.50–1.54 | normal | Bold metadata |
| Caption | SpotifyMixUI | 14px (0.88rem) | 400 | normal | normal | Metadata |
| Small Bold | SpotifyMixUI | 12px (0.75rem) | 700 | 1.50 | normal | Tags, counts |
| Small | SpotifyMixUI | 12px (0.75rem) | 400 | normal | normal | Fine print |
| Badge | SpotifyMixUI | 10.5px (0.66rem) | 600 | 1.33 | normal | `text-transform: capitalize` |
| Micro | SpotifyMixUI | 10px (0.63rem) | 400 | normal | normal | Smallest text |

### Principles

- **Bold/regular binary**: Most text is either 700 (bold) or 400 (regular), with 600 used sparingly. This creates a clear visual hierarchy through weight contrast rather than size variation.
- **Uppercase buttons as system**: Button labels use uppercase + wide letter-spacing (1.4px–2px), creating a systematic "label" voice distinct from content text.
- **Compact sizing**: The range is 10px–24px — narrower than most systems. Spotify's type is compact and functional, designed for scanning playlists, not reading articles.
- **Global script support**: The extensive fallback stack (Arabic, Hebrew, Cyrillic, Greek, Devanagari, CJK) reflects Spotify's 180+ market reach.

## 5. Component Stylings

### Buttons

**Accent Pill**

- Background: `#a855f7`
- Text: `var(--spot-bg)` (dark in dark mode, light in light mode)
- Padding: 8px 16px
- Radius: 9999px (full pill)
- Use: Primary CTAs, add buttons

**Dark Pill**

- Background: `var(--spot-hover)`
- Text: `var(--spot-text)` or `var(--spot-secondary)`
- Padding: 8px 16px
- Radius: 9999px (full pill)
- Use: Navigation pills, secondary actions

**Outlined Pill**

- Background: transparent
- Text: `var(--spot-text)`
- Border: `1px solid var(--spot-outline)`
- Radius: 9999px
- Use: Follow buttons, secondary actions

**Circular Play**

- Background: `#a855f7`
- Text: `var(--spot-bg)`
- Padding: 12px
- Radius: 50% (circle)
- Use: Play/pause controls

### Cards & Containers

- Background: `var(--spot-surface)`
- Radius: 6px–8px
- No visible borders on most cards
- Hover: `var(--spot-hover-50)` background
- Shadow: `var(--spot-shadow)` on elevated

### Inputs

- Background: `var(--spot-hover)`
- Text: `var(--spot-text)`
- Radius: 500px (pill)
- Focus ring: `#a855f7`
- Placeholder: `var(--spot-placeholder)`

### Navigation

- Sidebar: `var(--spot-bg)` background
- Active items: 14px weight 700, `var(--spot-text)`
- Inactive items: 14px weight 400, `var(--spot-secondary)`
- Circular icon buttons (50% radius)
- Brand logo top-left in purple

## 6. Layout Principles

### Spacing System

- Base unit: 8px
- Scale: 1px, 2px, 3px, 4px, 5px, 6px, 8px, 10px, 12px, 14px, 15px, 16px, 20px

### Grid & Container

- Sidebar (fixed) + main content area
- Grid-based album/playlist cards
- Responsive content area fills remaining space

### Whitespace Philosophy

- **Dark compression**: Spotify packs content densely — playlist grids, track lists, and navigation are all tightly spaced. The dark background provides visual rest between elements without needing large gaps.
- **Content density over breathing room**: This is an app, not a marketing site. Every pixel serves the listening experience.

### Border Radius Scale

- Minimal (2px): Badges, explicit tags
- Subtle (4px): Inputs, small elements
- Standard (6px): Album art containers, cards
- Comfortable (8px): Sections, dialogs
- Medium (10px–20px): Panels, overlay elements
- Large (100px): Large pill buttons
- Pill (500px): Primary buttons, search input
- Full Pill (9999px): Navigation pills, search
- Circle (50%): Play buttons, avatars, icons

## 7. Depth & Elevation

| Level | Treatment | Use |
| ------------------ | ---------------------------- | ------------------------------ |
| Base (Level 0) | `var(--spot-bg)` background | Deepest layer, page background |
| Surface (Level 1) | `var(--spot-surface)` | Cards, sidebar, containers |
| Elevated (Level 2) | `var(--spot-shadow)` | Dropdown menus, hover cards |
| Dialog (Level 3) | `var(--spot-shadow-heavy)` | Modals, overlays, menus |

## 8. Do's and Don'ts

### Do

- Use semantic color tokens (`spot-bg`, `spot-surface`, `spot-text`, etc.) — they adapt to theme
- Apply Accent Purple (`#a855f7`) only for play controls, active states, and primary CTAs
- Use pill shape (500px–9999px) for all buttons — circular (50%) for play controls
- Apply uppercase + wide letter-spacing (1.4px–2px) on button labels
- Keep typography compact (10px–24px range) — this is an app, not a magazine
- Use theme-aware shadows via CSS variables
- Test all components in both dark and light themes

### Don't

- Don't use Accent Purple decoratively or on backgrounds — it's functional only
- Don't hardcode theme-dependent colors — use CSS variable-backed tokens
- Don't skip the pill/circle geometry on buttons — square buttons break the identity
- Don't use hardcoded shadow values — use `shadow-spot` and `shadow-spot-heavy`
- Don't add additional brand colors — purple + achromatic grays is the complete palette
- Don't use `text-white` or `bg-white` directly — use `text-spot-text` and `bg-spot-active-pill-bg`
- Don't expose raw gray borders — use `border-spot-divider` or `border-spot-outline`

## 9. Responsive Behavior

### Breakpoints

| Name | Width | Key Changes |
| ------------- | ----------- | --------------------- |
| Mobile Small | <425px | Compact mobile layout |
| Mobile | 425–576px | Standard mobile |
| Tablet | 576–768px | 2-column grid |
| Tablet Large | 768–896px | Expanded layout |
| Desktop Small | 896–1024px | Sidebar visible |
| Desktop | 1024–1280px | Full desktop layout |
| Large Desktop | >1280px | Expanded grid |

### Collapsing Strategy

- Sidebar: full → collapsed → hidden
- Album grid: 5 columns → 3 → 2 → 1
- Search: pill input maintained, width adjusts
- Navigation: sidebar → bottom bar on mobile
- Theme toggle: always accessible in sidebar footer / mobile header

## 10. Agent Prompt Guide

### Quick Color Reference

| Role | Token | Value (dark) |
| -------------- | -------------------- | -------------- |
| Background | `bg-spot-bg` | `#121212` |
| Surface | `bg-spot-surface` | `#181818` |
| Hover | `bg-spot-hover` | `#1f1f1f` |
| Text primary | `text-spot-text` | `#ffffff` |
| Text secondary | `text-spot-secondary` | `#b3b3b3` |
| Text body | `text-spot-body` | `#cbcbcb` |
| Text muted | `text-spot-muted` | `#4d4d4d` |
| Accent | `text-spot-purple` | `#a855f7` |
| Divider | `border-spot-divider` | `rgba(...)` |
| Outline | `border-spot-outline` | `#7c7c7c` |
| Error | `text-spot-red` | `#f3727f` |

### Iteration Guide

1. Use semantic tokens (`spot-bg`, `spot-surface`, `spot-text`) — they handle theme switching
2. Accent Purple (`spot-purple`) for functional highlights only (active, CTA)
3. Pill everything — 500px for large, 9999px for small, 50% for circular
4. Uppercase + wide tracking on buttons — the systematic label voice
5. Theme-aware shadows via `shadow-spot` and `shadow-spot-heavy`
6. Never hardcode `text-white` or `bg-white` — use semantic tokens
docs/specs.md
# Glean - Design Document

## 1. Overview

Glean is a social RSS reader built on the AT Protocol. It operates as an **AppView** for the `at.glean.*` lexicon namespace: it indexes records from the relay firehose, serves XRPC query endpoints, and provides the web UI at [glean.at](https://glean.at).

Users store their RSS feed subscriptions as individual lexicon records on their PDS (one record per feed). Glean's AppView consumes the firehose, indexes those records, fetches the referenced RSS feeds, and serves both the reader UI and public XRPC APIs for the `at.glean.*` namespace.

The core idea: your RSS subscriptions are a strong signal about your interests. When enough people expose theirs, you can discover both **people** (who reads the same things) and **content** (what similar readers follow that you don't).

## 2. Stack

| Layer | Technology |
| ---------------- | ---------------------------------- |
| Backend | Go |
| Database | SQLite (via `modernc.org/sqlite`) |
| Frontend | htmx + TailwindCSS |
| Auth | AT Protocol OAuth / DID resolution |
| AT Protocol role | AppView for `at.glean.*` lexicons |
| Data source | AT Relay firehose → SQLite index |

## 3. AT Protocol Lexicons

All user data lives on their PDS. The server does not own user data — it indexes and aggregates it.

### 3.1 `at.glean.subscription`

A single RSS feed subscription. One record per feed per user. Created automatically when a user subscribes to a feed. During onboarding, existing subscriptions can be bulk-imported from an OPML file — the OPML is parsed and individual subscription records are created.

```json
{
  "lexicon": 1,
  "id": "at.glean.subscription",
  "defs": {
    "main": {
      "type": "record",
      "key": "tid",
      "description": "A single RSS feed subscription.",
      "record": {
        "type": "object",
        "required": ["feedUrl"],
        "properties": {
          "createdAt": { "type": "string", "format": "datetime" },
          "feedUrl": { "type": "string" },
          "title": { "type": "string" },
          "category": { "type": "string" }
        }
      }
    }
  }
}
```

#### OPML Import/Export

OPML is **not** part of the lexicon. It is only used as a transport format:

- **Import (onboarding)**: User uploads an OPML file. Glean parses it, validates each feed URL, creates individual `at.glean.subscription` records on the user's PDS, and indexes them locally.
- **Export (offboarding)**: Glean reads the user's `at.glean.subscription` records from their PDS and generates an OPML file for download. The user's repository remains the canonical source.
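
The import step can be sketched with Go's `encoding/xml`. This is a minimal illustration, not Glean's actual importer: the `Subscription` struct and `ParseOPML` function are hypothetical names, folder outlines are assumed to map to the `category` field, and URL validation and PDS writes are left out.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// outline mirrors an OPML <outline> element. Feed entries carry xmlUrl;
// folder entries carry nested outlines instead.
type outline struct {
	Text     string    `xml:"text,attr"`
	Title    string    `xml:"title,attr"`
	XMLURL   string    `xml:"xmlUrl,attr"`
	Children []outline `xml:"outline"`
}

type opmlDoc struct {
	Outlines []outline `xml:"body>outline"`
}

// Subscription is a hypothetical Go shape for the fields of an
// at.glean.subscription record that OPML can supply.
type Subscription struct {
	FeedURL  string
	Title    string
	Category string
}

// ParseOPML flattens an OPML document into subscriptions. A folder's
// text becomes the category of the feeds directly inside it (assumption).
func ParseOPML(data []byte) ([]Subscription, error) {
	var doc opmlDoc
	if err := xml.Unmarshal(data, &doc); err != nil {
		return nil, err
	}
	var subs []Subscription
	var walk func(items []outline, category string)
	walk = func(items []outline, category string) {
		for _, o := range items {
			if o.XMLURL == "" {
				walk(o.Children, o.Text) // folder: recurse with its name
				continue
			}
			title := o.Title
			if title == "" {
				title = o.Text
			}
			subs = append(subs, Subscription{
				FeedURL:  strings.TrimSpace(o.XMLURL),
				Title:    title,
				Category: category,
			})
		}
	}
	walk(doc.Outlines, "")
	return subs, nil
}

func main() {
	data := []byte(`<opml version="2.0"><body>
	  <outline text="Tech">
	    <outline text="Go Blog" xmlUrl="https://go.dev/blog/feed.atom"/>
	  </outline>
	</body></opml>`)
	subs, err := ParseOPML(data)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, s := range subs {
		fmt.Printf("%s [%s] %s\n", s.Title, s.Category, s.FeedURL)
	}
}
```

Each returned `Subscription` would then become one `at.glean.subscription` record on the user's PDS.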

### 3.2 `at.glean.annotation`

Reading notes on an article: quote a passage, tag it, rate it, or write a note. A user can have many annotations per article. These are public records on the PDS — users can always share an annotation on Bluesky if they want discussion.

```json
{
  "lexicon": 1,
  "id": "at.glean.annotation",
  "defs": {
    "main": {
      "type": "record",
      "key": "tid",
      "description": "Reading note on a specific RSS article.",
      "record": {
        "type": "object",
        "required": ["feedUrl", "articleUrl"],
        "properties": {
          "createdAt": { "type": "string", "format": "datetime" },
          "feedUrl": { "type": "string" },
          "articleUrl": { "type": "string" },
          "quote": { "type": "string", "maxGraphemes": 5000 },
          "note": { "type": "string", "maxGraphemes": 500 },
          "tags": {
            "type": "array",
            "items": { "type": "string", "maxGraphemes": 50 },
            "maxLength": 10
          },
          "rating": { "type": "integer", "minimum": 1, "maximum": 5 }
        }
      }
    }
  }
}
```

### 3.3 `at.glean.like`

A like records that a user endorses an article. Liked articles surface in the community feed of popular articles, and likes also feed into discovery and the recommendation system.

```json
{
  "lexicon": 1,
  "id": "at.glean.like",
  "defs": {
    "main": {
      "type": "record",
      "key": "tid",
      "description": "Like an RSS article.",
      "record": {
        "type": "object",
        "required": ["feedUrl", "articleUrl"],
        "properties": {
          "createdAt": { "type": "string", "format": "datetime" },
          "feedUrl": { "type": "string" },
          "articleUrl": { "type": "string" }
        }
      }
    }
  }
}
```

### 3.4 AppView Query Lexicons

As an AppView, Glean serves the following XRPC query endpoints. Other AT Protocol applications can call these to access indexed `at.glean.*` data without implementing their own indexer.

#### `at.glean.listSubscriptions`

List subscriptions from a repo, with optional filtering.

```
Input:
  repo: string (DID of the user)
  category?: string
  limit?: integer (default 50, max 100)
  cursor?: string

Output:
  cursor?: string
  subscriptions: [{ uri, cid, value: at.glean.subscription#main, indexedAt }]
```

#### `at.glean.listFeedLists`

List subscription lists from multiple repos, with optional filtering.

```
Input:
  actors?: string[] (filter by DIDs)
  limit?: integer (default 50, max 100)
  cursor?: string

Output:
  cursor?: string
  feeds: [{ did, subscriptionCount, subscriptions: [{ feedUrl, title, category }] }]
```

#### `at.glean.listAnnotations`

List annotations for an article, a feed, or by a user.

```
Input:
  feedUrl?: string
  articleUrl?: string
  author?: string (DID)
  limit?: integer (default 50, max 100)
  cursor?: string

Output:
  cursor?: string
  annotations: [{ uri, cid, author: { did, handle }, value, indexedAt }]
```

#### `at.glean.listLikes`

List liked articles, optionally filtered by user or feed.

```
Input:
  author?: string (DID)
  feedUrl?: string
  limit?: integer (default 50, max 100)
  cursor?: string

Output:
  cursor?: string
  likes: [{ uri, cid, author: { did, handle }, value: at.glean.like#main, indexedAt }]
```

#### `at.glean.getTrending`

Articles with the most likes, forming the community feed.

```
Input:
  limit?: integer (default 50, max 100)
  cursor?: string
  since?: string (datetime)

Output:
  cursor?: string
  articles: [{ feedUrl, articleUrl, title, likeCount, annotations: [...] }]
```

#### `at.glean.getRecommendations`

Get feed recommendations for a user based on clustering.

```
Input:
  repo: string (DID of the user)
  limit?: integer (default 20, max 50)

Output:
  feeds: [{ feedUrl, title, siteUrl, description, subscriberCount, score }]
  people: [{ did, handle, displayName, avatar, jaccard, commonFeeds }]
```
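
The `jaccard` and `commonFeeds` fields suggest a Jaccard similarity over two users' subscription sets, |A ∩ B| / |A ∪ B|. A minimal sketch of that computation follows; the function name and the choice to operate on plain feed-URL slices are assumptions.

```go
package main

import "fmt"

// Jaccard returns |A ∩ B| / |A ∪ B| for two lists of feed URLs, together
// with the feeds both users subscribe to. Duplicates within a list are
// counted once.
func Jaccard(a, b []string) (score float64, common []string) {
	setA := make(map[string]struct{}, len(a))
	for _, u := range a {
		setA[u] = struct{}{}
	}
	union := len(setA)
	seenB := make(map[string]struct{}, len(b))
	for _, u := range b {
		if _, dup := seenB[u]; dup {
			continue
		}
		seenB[u] = struct{}{}
		if _, ok := setA[u]; ok {
			common = append(common, u)
		} else {
			union++
		}
	}
	if union == 0 {
		return 0, nil // two empty subscription lists share nothing
	}
	return float64(len(common)) / float64(union), common
}

func main() {
	alice := []string{"https://a.example/rss", "https://b.example/rss", "https://c.example/rss"}
	bob := []string{"https://b.example/rss", "https://c.example/rss", "https://d.example/rss"}
	score, common := Jaccard(alice, bob)
	fmt.Println(score, common)
}
```

Two users sharing 2 of 4 distinct feeds score 0.5; identical lists score 1.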

### 3.5 AppView Firehose Consumption

Glean subscribes to the AT Relay firehose (`wss://bsky.network`) for all `at.glean.*` records:

```
SUBSCRIBE collections: ["at.glean.subscription", "at.glean.annotation", "at.glean.like"]
```

On each event:

- **create**: Insert record into local SQLite, update materialized counts
- **delete**: Tombstone the record (soft delete to preserve foreign key integrity)
- **update**: Replace the record's CID and value

The firehose consumer itself never writes records. Records are created on the user's own PDS (by the user, or by Glean's web UI acting on the user's behalf), and Glean indexes them only when they arrive through the firehose.
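
The three event cases can be sketched as a small dispatcher. The version below uses an in-memory map in place of SQLite and treats `update` as an upsert, which is an assumption (it tolerates a missed `create`); the `Event`, `Record`, and `Index` types are hypothetical and do not reflect the actual firehose wire format.

```go
package main

import "fmt"

// Event is a hypothetical, already-decoded firehose event for one record.
type Event struct {
	Action string // "create", "update", or "delete"
	URI    string // at:// URI of the record
	CID    string
	Value  string // record body as JSON
}

// Record is the indexed copy of a record. Deleted marks a tombstone.
type Record struct {
	CID     string
	Value   string
	Deleted bool
}

// Index stands in for the SQLite tables, keyed by record URI.
type Index map[string]*Record

// Apply updates the index for a single event.
func (idx Index) Apply(ev Event) error {
	switch ev.Action {
	case "create", "update":
		// Replace CID and value; update doubles as upsert (assumption).
		idx[ev.URI] = &Record{CID: ev.CID, Value: ev.Value}
	case "delete":
		// Soft delete: keep the row so references to it stay valid.
		if rec, ok := idx[ev.URI]; ok {
			rec.Deleted = true
		}
	default:
		return fmt.Errorf("unknown action %q", ev.Action)
	}
	return nil
}

func main() {
	idx := Index{}
	uri := "at://did:plc:example/at.glean.like/3kabc"
	events := []Event{
		{Action: "create", URI: uri, CID: "cid1", Value: `{"articleUrl":"https://example.com/a"}`},
		{Action: "delete", URI: uri},
	}
	for _, ev := range events {
		if err := idx.Apply(ev); err != nil {
			fmt.Println("skip:", err)
		}
	}
	fmt.Println(idx[uri].CID, idx[uri].Deleted)
}
```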

## 4. RSS Reader

Glean is first and foremost an RSS reader. It fetches, parses, and stores articles from RSS, Atom, and JSON feeds so users can read them in a clean interface.

### 4.1 Feed Fetching

A background scheduler polls subscribed feeds at regular intervals.

```
           ┌─────────────────────────┐
           │     Feed Scheduler      │
           │ (background goroutine)  │
           └────────┬────────────────┘
                    │ every N minutes
           ┌────────▼────────────────┐
           │      Feed Fetcher       │
           │                         │
           │ 1. SELECT feeds where   │
           │    next_fetch_at <= now │
           │ 2. Respect ETag/If-None-│
           │    Match / Last-Modified│
           │ 3. GET feed URL         │
           │ 4. Parse XML/JSON       │
           │ 5. Upsert articles      │
           │ 6. Update feed metadata │
           └────────┬────────────────┘
                    │
     ┌──────────────┼──────────────┐
     │              │              │
┌────▼────┐   ┌─────▼─────┐  ┌─────▼─────┐
│RSS/XML  │   │Atom/XML   │  │JSON Feed  │
│Parser   │   │Parser     │  │Parser     │
└─────────┘   └───────────┘  └───────────┘
```

### 4.2 Fetch Schedule

Feeds are not all fetched at the same frequency. The scheduler adapts based on:

- **Base interval**: Default 30 minutes
- **Feed-level override**: User can set per-feed refresh rate (15min / 30min / 1h / 3h / 6h / 12h / daily)
- **Adaptive backoff**: If a feed has not published new articles in the last N fetches, increase the interval. If it starts publishing again, decrease back.
- **HTTP cache**: Honor `ETag` and `Last-Modified` headers to skip parsing when nothing changed (304 Not Modified)
- **Error backoff**: On failure, double the interval up to 24h, reset on success

```sql
ALTER TABLE feeds ADD COLUMN fetch_interval_minutes INTEGER NOT NULL DEFAULT 30;
-- SQLite rejects CURRENT_TIMESTAMP as a default in ALTER TABLE ... ADD COLUMN,
-- so this column is nullable; treat NULL as "due now" or backfill it after migrating.
ALTER TABLE feeds ADD COLUMN next_fetch_at DATETIME;
ALTER TABLE feeds ADD COLUMN consecutive_empty_fetches INTEGER NOT NULL DEFAULT 0;
ALTER TABLE feeds ADD COLUMN error_count INTEGER NOT NULL DEFAULT 0;
```

### 4.3 Feed Parsing

RSS and Atom feeds are parsed with Go's `encoding/xml`; JSON Feed is parsed with `encoding/json`.

Each parser returns a normalized `Feed` and a slice of `Article` structs:

```go
type Feed struct {
	URL          string
	Title        string
	SiteURL      string
	Description  string
	Type         string // "rss", "atom", "json"
	ETag         string
	LastModified string
}

type Article struct {
	GUID      string
	Title     string
	URL       string
	Author    string
	Content   string
	Summary   string
	Published time.Time
	Updated   time.Time
}
```

Articles are deduplicated by `(feed_url, guid)`. On upsert, only the article metadata changes — reading state is preserved.
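
A minimal RSS 2.0 parser along these lines might look as follows. It is a sketch covering only a few fields: `ParseRSS` is a hypothetical name, falling back to the item link when `<guid>` is missing is an assumption, and dates that do not match RFC 1123 with a numeric zone are left as the zero time.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"time"
)

// rssDoc maps the subset of RSS 2.0 used in this sketch.
type rssDoc struct {
	Channel struct {
		Title string `xml:"title"`
		Items []struct {
			GUID    string `xml:"guid"`
			Title   string `xml:"title"`
			Link    string `xml:"link"`
			PubDate string `xml:"pubDate"`
		} `xml:"item"`
	} `xml:"channel"`
}

// Article is a trimmed-down version of the normalized struct above.
type Article struct {
	GUID      string
	Title     string
	URL       string
	Published time.Time
}

// ParseRSS decodes an RSS document and drops items whose GUID was
// already seen in the same document. Deduplication against stored
// articles would happen at upsert time via UNIQUE(feed_url, guid).
func ParseRSS(data []byte) ([]Article, error) {
	var doc rssDoc
	if err := xml.Unmarshal(data, &doc); err != nil {
		return nil, err
	}
	seen := make(map[string]bool)
	var articles []Article
	for _, it := range doc.Channel.Items {
		guid := it.GUID
		if guid == "" {
			guid = it.Link // assumption: link identifies the item
		}
		if guid == "" || seen[guid] {
			continue
		}
		seen[guid] = true
		published, _ := time.Parse(time.RFC1123Z, it.PubDate) // zero time if unparsable
		articles = append(articles, Article{
			GUID:      guid,
			Title:     it.Title,
			URL:       it.Link,
			Published: published,
		})
	}
	return articles, nil
}

func main() {
	data := []byte(`<rss version="2.0"><channel><title>Example</title>
	  <item><guid>1</guid><title>First</title><link>https://example.com/1</link>
	    <pubDate>Mon, 02 Jan 2006 15:04:05 +0000</pubDate></item>
	  <item><guid>1</guid><title>First (repeat)</title><link>https://example.com/1</link></item>
	</channel></rss>`)
	articles, err := ParseRSS(data)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, a := range articles {
		fmt.Println(a.GUID, a.Title, a.Published.Year())
	}
}
```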

### 4.4 Article Content

Glean stores article content locally so the reading experience is fast and consistent:

```sql
CREATE TABLE articles (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  feed_url TEXT NOT NULL REFERENCES feeds(feed_url),
  guid TEXT NOT NULL,
  title TEXT NOT NULL DEFAULT '',
  url TEXT,
  author TEXT,
  summary TEXT,
  content TEXT,
  published DATETIME,
  updated DATETIME,
  fetched_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  UNIQUE(feed_url, guid)
);
```

Content is stored as raw HTML from the feed's `<content:encoded>`, `<summary>`, or JSON Feed `content_html`. The server renders it in a sanitized view (strip `<script>`, `<iframe>`, etc.).

### 4.5 Read State

Read/unread state is tracked per user per article:

```sql
CREATE TABLE read_state (
  user_did TEXT NOT NULL REFERENCES users(did),
  article_id INTEGER NOT NULL REFERENCES articles(id),
  is_read BOOLEAN NOT NULL DEFAULT 0,
  read_at DATETIME,
  PRIMARY KEY (user_did, article_id)
);

CREATE INDEX idx_read_state_unread ON read_state(user_did, is_read) WHERE is_read = 0;
```

### 4.6 Reading Experience

The `/articles` page is the main reading view:

- **River of news**: Chronological list of all unread articles across all subscriptions
- **Feed filter**: Narrow to a single feed or category
- **Mark as read**: Individual or "mark all above" / "mark all"
- **Like**: Endorse an article (public, written to the user's PDS as an `at.glean.like` record)
- **Open original**: Title links to the source article
- **Share**: Share to Bluesky
- **Keyboard navigation**: `j`/`k` to navigate, `l` to like, `m` to mark read (progressive enhancement via a small `<script>` block)

### 4.7 Feed Discovery from Content

Beyond the clustering system, Glean also discovers new feeds from article content:

- **Auto-discovery**: When fetching a feed, parse `<link rel="alternate" type="application/rss+xml">` from the feed's site URL to discover related feeds
- **Feed favicon**: Fetch `favicon.ico` or `/apple-touch-icon.png` from the feed's site URL for display
- **Dead feed detection**: If a feed fails for 7 consecutive fetches (3.5 hours at the 30-minute base interval, longer once error backoff applies), mark it as dead. Notify the user and offer to remove it.

## 5. System Architecture

Glean runs as a single Go binary that fills three roles: **AppView** (indexing `at.glean.*` records from the firehose, serving XRPC queries), **RSS reader** (fetching and storing feed content), and **web UI** (htmx frontend).

```
                    AT Relay (bsky.network)
                               │ firehose
                               ▼
                    ┌──────────────────────┐
                    │ Go Server (glean.at) │
                    │                      │
Browser ──HTTP──►   │ ┌─────────────────┐  │ ──XRPC queries──► Other AT apps
(htmx + TW)         │ │     Router      │  │
                    │ │  ┌───────────┐  │  │
                    │ │  │ Handlers  │  │  │
                    │ │  │(UI + XRPC)│  │  │
                    │ │  └─────┬─────┘  │  │
                    │ └────────┼────────┘  │
                    │          │           │
                    │ ┌────────▼────────┐  │         ┌──────────────────┐
                    │ │  Service Layer  │  │         │  Feed Scheduler  │
                    │ │                 │──┼──sync──►│   (goroutine)    │
                    │ └────────┬────────┘  │         │ Fetcher + Parser │
                    │          │           │         └────────┬─────────┘
                    │ ┌────────▼────────┐  │                  │
                    │ │     SQLite      │  │         RSS/Atom/JSON feeds
                    │ │ (firehose idx,  │  │
                    │ │  articles,      │  │         ┌──────────────────┐
                    │ │  read state,    │  │         │  Cluster Engine  │
                    │ │  clustering)    │◄─┼────────►│ (periodic cron)  │
                    │ └─────────────────┘  │         └──────────────────┘
                    └──────────────────────┘

AppView responsibilities:
• Subscribe to firehose for at.glean.subscription, at.glean.annotation, at.glean.like
• Index records into SQLite
• Serve XRPC query endpoints (at.glean.listSubscriptions, etc.)
• Host the web UI at glean.at
• Write to user PDS on behalf of user (when user acts through UI)
```
418418+419419+## 6. Database Schema (SQLite)
420420+421421+### 6.1 Users
422422+423423+```sql
424424+CREATE TABLE users (
425425+ did TEXT PRIMARY KEY,
426426+ handle TEXT NOT NULL,
427427+ display_name TEXT,
428428+ avatar_url TEXT,
429429+ indexed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
430430+ updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
431431+);
432432+```
433433+434434+### 6.2 Feed Subscriptions
435435+436436+Indexed from `at.glean.subscription` records on user PDS.
437437+438438+```sql
439439+CREATE TABLE subscriptions (
440440+ id INTEGER PRIMARY KEY AUTOINCREMENT,
441441+ user_did TEXT NOT NULL REFERENCES users(did),
442442+ feed_url TEXT NOT NULL,
443443+ category TEXT,
444444+ added_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
445445+ UNIQUE(user_did, feed_url)
446446+);
447447+448448+CREATE INDEX idx_subscriptions_feed ON subscriptions(feed_url);
449449+CREATE INDEX idx_subscriptions_user ON subscriptions(user_did);
450450+```
451451+452452+### 6.3 Feeds
453453+454454+Master list of all known RSS feeds.
455455+456456+```sql
457457+CREATE TABLE feeds (
458458+ feed_url TEXT PRIMARY KEY,
459459+ title TEXT,
460460+ site_url TEXT,
461461+ description TEXT,
462462+ feed_type TEXT CHECK(feed_type IN ('rss', 'atom', 'json')),
463463+ last_fetched_at DATETIME,
464464+ last_error TEXT,
465465+ subscriber_count INTEGER NOT NULL DEFAULT 0
466466+);
467467+```
468468+469469+### 6.4 Articles
470470+471471+Fetched from RSS feeds. Only fetched for feeds that have local subscribers.
472472+473473+```sql
474474+CREATE TABLE articles (
475475+ id INTEGER PRIMARY KEY AUTOINCREMENT,
476476+ feed_url TEXT NOT NULL REFERENCES feeds(feed_url),
477477+ guid TEXT NOT NULL,
478478+ title TEXT,
479479+ url TEXT,
480480+ author TEXT,
481481+ published DATETIME,
482482+ fetched_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
483483+ UNIQUE(feed_url, guid)
484484+);
485485+486486+CREATE INDEX idx_articles_feed ON articles(feed_url);
487487+CREATE INDEX idx_articles_published ON articles(published DESC);
488488+```
489489+490490+### 6.5 Annotations, Likes
491491+492492+Local mirror of AT Protocol lexicon records for fast querying.
493493+494494+```sql
495495+CREATE TABLE annotations (
496496+ id INTEGER PRIMARY KEY AUTOINCREMENT,
497497+ uri TEXT NOT NULL UNIQUE,
498498+ author_did TEXT NOT NULL REFERENCES users(did),
499499+ feed_url TEXT NOT NULL,
500500+ article_url TEXT NOT NULL,
501501+ quote TEXT,
502502+ note TEXT,
503503+ tags TEXT,
504504+ rating INTEGER,
505505+ created_at DATETIME NOT NULL
506506+);
507507+508508+CREATE TABLE likes (
509509+ id INTEGER PRIMARY KEY AUTOINCREMENT,
510510+ uri TEXT NOT NULL UNIQUE,
511511+ author_did TEXT NOT NULL REFERENCES users(did),
512512+ feed_url TEXT NOT NULL,
513513+ article_url TEXT NOT NULL,
514514+ created_at DATETIME NOT NULL,
515515+ UNIQUE(author_did, feed_url, article_url)
516516+);
517517+```
518518+519519+### 6.6 Cluster Precomputation
520520+521521+Stores precomputed similarity data to avoid recalculating on every request.
522522+523523+```sql
524524+CREATE TABLE feed_similarity (
525525+ feed_a TEXT NOT NULL REFERENCES feeds(feed_url),
526526+ feed_b TEXT NOT NULL REFERENCES feeds(feed_url),
527527+ jaccard REAL NOT NULL,
528528+ computed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
529529+ PRIMARY KEY (feed_a, feed_b),
530530+ CHECK(feed_a < feed_b)
531531+);
532532+533533+CREATE TABLE user_similarity (
534534+ user_a TEXT NOT NULL REFERENCES users(did),
535535+ user_b TEXT NOT NULL REFERENCES users(did),
536536+ jaccard REAL NOT NULL,
537537+ common_feeds INTEGER NOT NULL,
538538+ computed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
539539+ PRIMARY KEY (user_a, user_b),
540540+ CHECK(user_a < user_b)
541541+);
542542+```
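
Both tables carry a `CHECK(a < b)` constraint, so each unordered pair is stored exactly once and any writer or reader has to normalise the order first. A small sketch of that normalisation follows; `OrderedPair` is a hypothetical helper name, and it relies on Go's byte-wise string comparison matching SQLite's default `BINARY` collation for `TEXT`.

```go
package main

import "fmt"

// OrderedPair returns the two keys in the order the similarity tables
// expect (first < second). ok is false when both keys are equal, since a
// self-pair can never satisfy the CHECK constraint.
func OrderedPair(a, b string) (first, second string, ok bool) {
	switch {
	case a < b:
		return a, b, true
	case b < a:
		return b, a, true
	default:
		return "", "", false
	}
}

func main() {
	first, second, ok := OrderedPair("did:plc:zed", "did:plc:amy")
	fmt.Println(first, second, ok)
}
```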

## 7. Clustering & Recommendations

Glean has two complementary recommendation signals:

- **Subscriptions** (Jaccard similarity): "Who reads the same feeds?" → feed and people discovery
- **Likes** (co-occurrence): "Who likes the same articles?" → article and feed discovery

### 7.1 Feed Co-occurrence (Jaccard Similarity)

For any two feeds, the similarity is the Jaccard index of their subscriber sets:

```
J(A, B) = |subscribers(A) ∩ subscribers(B)| / |subscribers(A) ∪ subscribers(B)|
```

This is recomputed periodically (cron job) or incrementally when subscriptions change.

### 7.2 User Similarity

For any two users, compute Jaccard over their subscription sets:

```
J(U1, U2) = |feeds(U1) ∩ feeds(U2)| / |feeds(U1) ∪ feeds(U2)|
```
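
Both formulas are the same set operation applied to different sets. A minimal in-memory sketch in Go, assuming the sets are already loaded as strings; `Set`, `NewSet`, and `Jaccard` are illustrative names rather than part of the Glean codebase.

```go
package main

import "fmt"

// Set is a set of string identifiers (subscriber DIDs or feed URLs).
type Set map[string]struct{}

// NewSet builds a Set from the given items, dropping duplicates.
func NewSet(items ...string) Set {
	s := make(Set, len(items))
	for _, it := range items {
		s[it] = struct{}{}
	}
	return s
}

// Jaccard returns |a ∩ b| / |a ∪ b|. Two empty sets are defined here to
// have similarity 0 so the function never divides by zero.
func Jaccard(a, b Set) float64 {
	if len(a) == 0 && len(b) == 0 {
		return 0
	}
	// Iterate over the smaller set to count the intersection.
	if len(a) > len(b) {
		a, b = b, a
	}
	inter := 0
	for k := range a {
		if _, ok := b[k]; ok {
			inter++
		}
	}
	union := len(a) + len(b) - inter
	return float64(inter) / float64(union)
}

func main() {
	u1 := NewSet("feed-a", "feed-b", "feed-c")
	u2 := NewSet("feed-b", "feed-c", "feed-d")
	// Two shared feeds out of four distinct feeds.
	fmt.Printf("%.2f\n", Jaccard(u1, u2))
}
```

The union is derived from the two sizes and the intersection, which is the same identity a SQL implementation would use with per-set counts.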

### 7.3 Recommendation Algorithms

**Feed recommendations (on glean.at):**

1. Find users with Jaccard > 0.2 (similar readers)
2. Collect feeds those users subscribe to that the target user does not
3. Rank by frequency (how many similar users subscribe) and average similarity
4. Return top N feeds as recommendations

```
score(feed) = Σ J(target, U) for each user U subscribed to feed
```
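
The four steps and the score formula can be sketched in memory as follows. The input shapes (`similarity` keyed by the other user's DID, `subs` keyed by DID) and the names `RecommendFeeds` and `Recommendation` are assumptions made for illustration; each user's feed list is assumed to contain no duplicates, as the `UNIQUE(user_did, feed_url)` constraint implies.

```go
package main

import (
	"fmt"
	"sort"
)

// minSimilarity is the Jaccard threshold from step 1.
const minSimilarity = 0.2

// Recommendation is one scored candidate feed (hypothetical type).
type Recommendation struct {
	FeedURL string
	Score   float64
}

// RecommendFeeds sums J(target, U) over every similar user U subscribed to
// a feed the target does not already have, then returns the top n feeds.
func RecommendFeeds(targetFeeds map[string]bool, similarity map[string]float64,
	subs map[string][]string, n int) []Recommendation {
	scores := make(map[string]float64)
	for user, j := range similarity {
		if j <= minSimilarity {
			continue // step 1: keep only similar readers
		}
		for _, feed := range subs[user] {
			if targetFeeds[feed] {
				continue // step 2: skip feeds the target already reads
			}
			scores[feed] += j // step 3: accumulate similarity per feed
		}
	}
	recs := make([]Recommendation, 0, len(scores))
	for feed, score := range scores {
		recs = append(recs, Recommendation{FeedURL: feed, Score: score})
	}
	// Highest score first; ties broken by URL for a stable order.
	sort.Slice(recs, func(i, k int) bool {
		if recs[i].Score != recs[k].Score {
			return recs[i].Score > recs[k].Score
		}
		return recs[i].FeedURL < recs[k].FeedURL
	})
	if n >= 0 && len(recs) > n {
		recs = recs[:n] // step 4: top N
	}
	return recs
}

func main() {
	target := map[string]bool{"feed-a": true}
	similarity := map[string]float64{"alice": 0.5, "bob": 0.25, "carol": 0.1}
	subs := map[string][]string{
		"alice": {"feed-a", "feed-b"},
		"bob":   {"feed-b", "feed-c"},
		"carol": {"feed-d"},
	}
	for _, r := range RecommendFeeds(target, similarity, subs, 10) {
		fmt.Println(r.FeedURL, r.Score)
	}
}
```

Summing similarities rewards a feed both for being common among similar readers and for those readers being close matches, which is how a single score covers the two ranking criteria in step 3.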

**Article recommendations (on glean.at, from likes):**

1. Find users who liked articles that the target user also liked
2. Collect articles those users liked that the target has not
3. Rank by frequency and recency
4. Return top N articles as recommendations

```
score(article) = Σ 1/logN(likers(article)) for each user U who liked it
```

The `1/logN` weighting damps articles that already have many likers, so widely liked articles (typically from very large feeds) are not over-recommended.

**People recommendations (to follow on Bluesky):**

1. Compute user similarity for all pairs
2. Return users with highest Jaccard, linking to their Bluesky profile for follow

### 7.4 Implementation

For the initial version, a brute-force computation in SQLite is sufficient (scale: ~10k users, ~100k subscriptions). The query below ranks candidate feeds by how many overlapping readers subscribe to them; it binds the target user's DID twice:

```sql
-- s1: the target user's subscriptions
-- s2: other users who share at least one feed with the target
-- s3: everything those users subscribe to
SELECT s3.feed_url, COUNT(DISTINCT s2.user_did) AS overlap_count
FROM subscriptions s1
JOIN subscriptions s2 ON s2.feed_url = s1.feed_url AND s2.user_did != s1.user_did
JOIN subscriptions s3 ON s3.user_did = s2.user_did
WHERE s1.user_did = ?
AND s3.feed_url NOT IN (SELECT feed_url FROM subscriptions WHERE user_did = ?)
GROUP BY s3.feed_url
ORDER BY overlap_count DESC
LIMIT 20;
```
614614+615615+For larger scale, move to MinHash + LSH (banded hashing) to approximate Jaccard in sub-linear time.
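
To make the MinHash half of that idea concrete, here is a rough sketch that estimates Jaccard from fixed-size signatures. It assumes 128 hash functions simulated by seeding FNV-1a, which is convenient for a demo but not a carefully chosen hash family; the LSH banding step that turns signatures into candidate buckets is omitted. All names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
	"math"
)

// numHashes is the signature length; more hashes give a tighter estimate.
const numHashes = 128

// hashWithSeed simulates the seed-th hash function by prefixing the seed
// to the item before hashing with FNV-1a.
func hashWithSeed(seed uint64, item string) uint64 {
	h := fnv.New64a()
	var buf [8]byte
	binary.LittleEndian.PutUint64(buf[:], seed)
	h.Write(buf[:])
	h.Write([]byte(item))
	return h.Sum64()
}

// Signature keeps, for each hash function, the minimum hash over all items.
// An empty input yields a signature filled with the maximum uint64 value.
func Signature(items []string) []uint64 {
	sig := make([]uint64, numHashes)
	for i := range sig {
		sig[i] = math.MaxUint64
		for _, it := range items {
			if v := hashWithSeed(uint64(i), it); v < sig[i] {
				sig[i] = v
			}
		}
	}
	return sig
}

// Estimate returns the fraction of signature positions that agree, which
// approximates the Jaccard index of the underlying sets. Signatures of
// different or zero length are treated as not comparable and return 0.
func Estimate(a, b []uint64) float64 {
	if len(a) == 0 || len(a) != len(b) {
		return 0
	}
	match := 0
	for i := range a {
		if a[i] == b[i] {
			match++
		}
	}
	return float64(match) / float64(len(a))
}

func main() {
	u1 := Signature([]string{"feed-a", "feed-b", "feed-c", "feed-d"})
	u2 := Signature([]string{"feed-b", "feed-c", "feed-d", "feed-e"})
	fmt.Printf("estimated Jaccard: %.2f (exact value is 0.60)\n", Estimate(u1, u2))
}
```

Signatures have a fixed size regardless of how many feeds a user follows, which is what makes pairwise comparison cheap once the user count grows.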
616616+617617+### 7.5 Clustering Engine (Cron)
618618+619619+A background goroutine runs on a schedule (e.g., every 6 hours):
620620+621621+1. **Firehose ingestion**: Subscribe to AT Protocol firehose for `at.glean.*` records
622622+2. **Index new records**: Parse lexicon records, upsert into SQLite
623623+3. **Compute similarities**: Batch-update the `feed_similarity`, `user_similarity`, and `article_co_like` tables
624624+4. **Generate recommendations**: Materialize top recommendations per user into cache tables
625625+626626+```sql
627627+CREATE TABLE user_feed_recommendations (
628628+ user_did TEXT NOT NULL REFERENCES users(did),
629629+ feed_url TEXT NOT NULL REFERENCES feeds(feed_url),
630630+ score REAL NOT NULL,
631631+ computed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
632632+ PRIMARY KEY (user_did, feed_url)
633633+);
634634+635635+CREATE TABLE user_article_recommendations (
636636+ user_did TEXT NOT NULL REFERENCES users(did),
637637+ feed_url TEXT NOT NULL,
638638+ article_url TEXT NOT NULL,
639639+ score REAL NOT NULL,
640640+ computed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
641641+ PRIMARY KEY (user_did, feed_url, article_url)
642642+);
643643+```

## 8. HTTP API / htmx Endpoints

The server renders HTML fragments that htmx swaps into the page. No JSON API needed for the frontend.

### 8.1 Pages

| Route                  | Method | Description                                              |
| ---------------------- | ------ | -------------------------------------------------------- |
| `/`                    | GET    | Landing page / auth redirect                             |
| `/dashboard`           | GET    | Main dashboard: unread articles, recommendations sidebar |
| `/feeds`               | GET    | Manage RSS subscriptions (OPML import for onboarding)    |
| `/feeds/opml/upload`   | POST   | Upload OPML file to bulk-import subscriptions            |
| `/feeds/opml/download` | GET    | Export subscriptions as OPML (offboarding)               |
| `/feeds/add`           | POST   | Add a single feed URL                                    |
| `/feeds/remove`        | DELETE | Remove a feed                                            |
| `/articles`            | GET    | Read articles (paginated, filterable by feed)            |
| `/trending`            | GET    | Community feed: articles ranked by likes                 |
| `/discover`            | GET    | Feed recommendations + similar people                    |
| `/discover/feeds`      | GET    | Recommended feeds                                        |
| `/discover/people`     | GET    | People with similar reading habits                       |
| `/profile/{did}`       | GET    | Public profile: their feeds, likes, annotations          |
| `/articles/{id}/like`  | POST   | Like an article (amplify into community feed)            |
| `/annotations`         | GET    | View your annotations                                    |
| `/annotations/create`  | POST   | Create annotation on an article                          |

### 8.2 htmx Patterns

- **Feed list**: `<div hx-get="/feeds/list" hx-trigger="load">` renders the subscription list as a fragment
- **Infinite scroll articles**: `<div hx-get="/articles?page=2" hx-trigger="intersect">` for pagination
- **Like button**: `<button hx-post="/articles/{id}/like" hx-swap="outerHTML">` self-updates the button state
- **OPML upload**: `<form hx-post="/feeds/opml/upload" hx-encoding="multipart/form-data" hx-target="#feed-list">`
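
As an illustration of the like-button pattern, the sketch below shows a handler that answers the `hx-post` with just the replacement button, which htmx then swaps in via `outerHTML`. The fragment markup, the `likeHandler` and `request` names, and the manual path parsing are assumptions for this example; the real handler would also persist the like before rendering.

```go
package main

import (
	"fmt"
	"html/template"
	"net/http"
	"net/http/httptest"
	"strings"
)

const (
	likePrefix = "/articles/"
	likeSuffix = "/like"
)

// likeButton is the fragment returned in place of the clicked button.
var likeButton = template.Must(template.New("like-button").Parse(
	`<button hx-post="/articles/{{.ID}}/like" hx-swap="outerHTML" disabled>Liked</button>`))

// likeHandler handles POST /articles/{id}/like and writes only the fragment,
// never a full page.
func likeHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	path := r.URL.Path
	if len(path) <= len(likePrefix)+len(likeSuffix) ||
		!strings.HasPrefix(path, likePrefix) || !strings.HasSuffix(path, likeSuffix) {
		http.NotFound(w, r)
		return
	}
	id := strings.TrimSuffix(strings.TrimPrefix(path, likePrefix), likeSuffix)
	if strings.Contains(id, "/") {
		http.NotFound(w, r)
		return
	}
	// Placeholder: a real handler would record the like here.
	w.Header().Set("Content-Type", "text/html; charset=utf-8")
	if err := likeButton.Execute(w, struct{ ID string }{ID: id}); err != nil {
		http.Error(w, "template error", http.StatusInternalServerError)
	}
}

// request drives the handler in memory, without opening a network listener.
func request(method, path string) (int, string) {
	rec := httptest.NewRecorder()
	likeHandler(rec, httptest.NewRequest(method, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := request(http.MethodPost, "/articles/42/like")
	fmt.Println(code)
	fmt.Println(body)
}
```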
676676+677677+## 9. Project Structure
678678+679679+```
680680+glean/
681681+├── main.go # Entry point, wire everything
682682+├── go.mod
683683+├── go.sum
684684+├── internal/
685685+│ ├── atproto/
686686+│ │ ├── auth.go # DID resolution, OAuth flow
687687+│ │ ├── client.go # XRPC client (write to user PDS)
688688+│ │ ├── firehose.go # Subscribe to AT Relay firehose
689689+│ │ ├── lexicon.go # Lexicon record types + validation
690690+│ │ └── xrpc.go # XRPC query handlers (AppView endpoints)
691691+│ ├── db/
692692+│ │ ├── db.go # SQLite connection, migrations
693693+│ │ ├── user.go # User queries
694694+│ │ ├── feed.go # Feed + subscription queries
695695+│ │ ├── article.go # Article queries
696696+│ │ ├── social.go # Like, annotation queries
697697+│ │ └── cluster.go # Similarity + recommendation queries
698698+│ ├── feed/
699699+│ │ ├── parser.go # RSS/Atom/JSON feed parser
700700+│ │ ├── fetcher.go # Fetch and parse feeds from URLs
701701+│ │ └── opml.go # OPML import/export
702702+│ ├── cluster/
703703+│ │ ├── jaccard.go # Jaccard similarity computation
704704+│ │ ├── recommender.go # Feed + people recommendation logic
705705+│ │ └── cron.go # Background recomputation scheduler
706706+│ ├── server/
707707+│ │ ├── server.go # HTTP server, router setup
708708+│ │ ├── middleware.go # Auth, logging, CSRF middleware
709709+│ │ ├── session.go # Session management (cookie + DID)
710710+│ │ └── handlers/
711711+│ │ ├── dashboard.go
712712+│ │ ├── feeds.go
713713+│ │ ├── articles.go
714714+│ │ ├── trending.go
715715+│ │ ├── discover.go
716716+│ │ ├── profile.go
717717+│ │ ├── annotations.go
718718+│ │ └── auth.go
719719+│ └── tmpl/
720720+│ ├── base.html # Base template with htmx + Tailwind
721721+│ ├── partials/
722722+│ │ ├── feed-list.html
723723+│ │ ├── article-list.html
724724+│ │ ├── like-button.html
725725+│ │ ├── annotation-card.html
726726+│ │ ├── recommendation-card.html
727727+│ │ └── profile-card.html
728728+│ ├── dashboard.html
729729+│ ├── feeds.html
730730+│ ├── articles.html
731731+│ ├── trending.html
732732+│ ├── discover.html
733733+│ ├── profile.html
734734+│ └── annotations.html
735735+├── static/
736736+│ ├── input.css # Tailwind input
737737+│ └── output.css # Tailwind compiled output
738738+├── docs/
739739+│ └── design.md # This document
740740+└── tailwind.config.js
741741+```

## 10. Auth Flow

1. User visits `/`, clicks "Sign in with Bluesky" (or any AT Proto PDS)
2. Server redirects to AT Protocol OAuth authorization endpoint
3. User authorizes on their PDS
4. PDS redirects back with authorization code
5. Server exchanges code for access token + refresh token
6. Server resolves DID and creates a session (cookie with encrypted DID)
7. On each request, middleware decrypts session, loads user from DB (or creates if new)

The server never stores the user's AT Protocol password. It stores the session tokens for XRPC calls to the user's PDS (to read/write their lexicon records).
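
Steps 6 and 7 hinge on a cookie value that only the server can read or forge. One way to do that, sketched here under the assumption of AES-256-GCM with a server-side key, is to seal the DID into a URL-safe string; the document itself does not name a cipher, and `sealDID` and `openDID` are hypothetical names.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD from a 16, 24, or 32 byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealDID encrypts and authenticates a DID, returning nonce||ciphertext as a
// base64url string suitable for a cookie value.
func sealDID(key []byte, did string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	sealed := gcm.Seal(nonce, nonce, []byte(did), nil)
	return base64.RawURLEncoding.EncodeToString(sealed), nil
}

// openDID reverses sealDID and fails if the value was modified or was
// produced with a different key.
func openDID(key []byte, value string) (string, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	raw, err := base64.RawURLEncoding.DecodeString(value)
	if err != nil {
		return "", err
	}
	if len(raw) < gcm.NonceSize() {
		return "", errors.New("session value too short")
	}
	nonce, ciphertext := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	plain, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return "", err
	}
	return string(plain), nil
}

func main() {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	sealed, err := sealDID(key, "did:plc:example")
	if err != nil {
		panic(err)
	}
	did, err := openDID(key, sealed)
	fmt.Println(did, err)
}
```

An authenticated mode matters here: with encryption alone, a client could still flip bits in the cookie, whereas GCM rejects any modified value during decryption.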
754754+755755+## 11. Data Flow
756756+757757+### 11.1 User Imports OPML
758758+759759+```
760760+Browser ──POST /feeds/opml/upload──► Server
761761+ │
762762+ ├─► Parse OPML, extract feed URLs
763763+ ├─► Fetch each feed, validate + store in `feeds` table
764764+ ├─► For each feed, create an `at.glean.subscription` record
765765+ │ via XRPC write to user's PDS
766766+ ├─► Insert subscriptions in local `subscriptions` table
767767+ └─◄ Return updated feed list fragment (htmx)
768768+```
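
The first step of this flow, extracting feed URLs from the uploaded file, might look like the sketch below. It uses only `encoding/xml`, treats an outline without `xmlUrl` as a folder whose `text` becomes the category, and assumes UTF-8 input. `ParseOPML`, `FeedRef`, and the sample document are illustrative and are not taken from `internal/feed/opml.go`.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// outline mirrors an OPML <outline> element, which may nest.
type outline struct {
	XMLURL   string    `xml:"xmlUrl,attr"`
	Title    string    `xml:"title,attr"`
	Text     string    `xml:"text,attr"`
	Children []outline `xml:"outline"`
}

type opmlDoc struct {
	Outlines []outline `xml:"body>outline"`
}

// FeedRef is one feed found in the file (hypothetical type).
type FeedRef struct {
	URL      string
	Title    string
	Category string
}

// ParseOPML walks the outline tree in document order and returns every
// entry that carries an xmlUrl attribute.
func ParseOPML(data []byte) ([]FeedRef, error) {
	var doc opmlDoc
	if err := xml.Unmarshal(data, &doc); err != nil {
		return nil, err
	}
	var feeds []FeedRef
	var walk func(nodes []outline, category string)
	walk = func(nodes []outline, category string) {
		for _, n := range nodes {
			if n.XMLURL == "" {
				walk(n.Children, n.Text) // folder: descend with its name
				continue
			}
			title := n.Title
			if title == "" {
				title = n.Text
			}
			feeds = append(feeds, FeedRef{
				URL:      strings.TrimSpace(n.XMLURL),
				Title:    title,
				Category: category,
			})
		}
	}
	walk(doc.Outlines, "")
	return feeds, nil
}

const sampleOPML = `<?xml version="1.0" encoding="UTF-8"?>
<opml version="2.0">
  <head><title>Subscriptions</title></head>
  <body>
    <outline text="Tech">
      <outline type="rss" text="Tech Blog" xmlUrl="https://example.org/tech.xml"/>
    </outline>
    <outline type="rss" title="News" text="news" xmlUrl="https://example.org/news.xml"/>
  </body>
</opml>`

func main() {
	feeds, err := ParseOPML([]byte(sampleOPML))
	if err != nil {
		panic(err)
	}
	for _, f := range feeds {
		fmt.Printf("%s | %s | %q\n", f.URL, f.Title, f.Category)
	}
}
```

The category maps naturally onto the optional `category` column of the `subscriptions` table.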
769769+770770+### 11.2 Reading the Feed
771771+772772+```
773773+Browser ──GET /articles──► Server
774774+ │
775775+ ├─► Query articles for user's subscriptions
776776+ ├─► Render article-list.html partial
777777+ └─◄ Return HTML fragment (htmx)
778778+```
779779+780780+### 11.3 Recommendations
781781+782782+```
783783+Cron (every 6h) ──► Cluster Engine
784784+ │
785785+ ├─► SELECT user similarity pairs
786786+ ├─► Compute recommendation scores
787787+ └─► INSERT into user_feed_recommendations
788788+789789+Browser ──GET /discover/feeds──► Server
790790+ │
791791+ ├─► SELECT from user_feed_recommendations
792792+ ├─► Fetch feed metadata
793793+ └─◄ Render recommendation cards (htmx)
794794+```
795795+796796+## 12. Key Design Decisions
797797+798798+### 12.1 Why use lexicon records for feed subscriptions?
799799+800800+- **User sovereignty**: Each feed subscription lives as a record on the user's PDS. They can export them, move PDS, or revoke access at any time.
801801+- **Interoperability**: Any AT Protocol app can read the lexicon and integrate with Glean data.
802802+- **No lock-in**: If Glean shuts down, the user's data is intact on their PDS.
803803+- **OPML as transport only**: OPML is a common interchange format used solely for import (onboarding from existing readers) and export (offboarding). The canonical representation is always the lexicon — individual `at.glean.subscription` records, one per feed.
804804+805805+### 12.2 Why SQLite?
806806+807807+- Single-binary deployment, no external database dependency
808808+- More than sufficient for the expected scale (tens of thousands of users)
809809+- Go's `database/sql` interface makes it easy to swap later if needed
810810+- Matches the project's philosophy of simplicity
811811+812812+### 12.3 Why htmx?
813813+814814+- No JavaScript build pipeline
815815+- Server renders everything — simpler mental model
816816+- Progressive enhancement works naturally
817817+- Perfect fit for a read-centric application
818818+- TailwindCSS handles styling without writing custom CSS
819819+820820+### 12.4 AppView Architecture
821821+822822+Glean operates as an AT Protocol AppView. This means:
823823+824824+- **Read path**: All `at.glean.*` data is consumed from the Relay firehose, not by polling individual PDS instances. The firehose handler runs as a persistent goroutine, upserting records into SQLite as they arrive.
825825+- **Write path**: Users write records to their own PDS (via standard AT Protocol `com.atproto.repo.createRecord` / `deleteRecord`). Glean never stores user data directly — it only indexes what the firehose delivers.
826826+- **Query path**: Other AT Protocol apps can query Glean's XRPC endpoints to access indexed data (subscriptions, annotations, likes, recommendations) without building their own indexer.
827827+- **Trade-off**: Article content (fetched from RSS feeds) is local-only and not part of the AT Protocol layer. Only individual feed subscription records (`at.glean.subscription`) live on the PDS.
828828+829829+### 12.5 Privacy Model
830830+831831+All PDS records are public. There is no notion of private data on the AT Protocol — everything stored on the repo is visible. Users should be aware that annotations, subscriptions, and likes are all public records.
832832+833833+## 13. Future Considerations
834834+835835+- **MinHash/LSH**: Replace brute-force Jaccard when user count exceeds ~50k
836836+- **Full-text search**: Add FTS5 virtual table on articles for search
837837+- **Feed groups / reading lists**: Allow users to create curated lists (separate lexicon)
838838+- **Email digest**: Periodic email with top articles from subscribed feeds
839839+- **Multi-AppView scaling**: Distribute firehose consumption across multiple instances behind a load balancer
···11+package cluster
22+33+import (
44+ "context"
55+ "database/sql"
66+ "log/slog"
77+)
88+99+type Engine struct {
1010+ db *sql.DB
1111+ logger *slog.Logger
1212+}
1313+1414+func NewEngine(db *sql.DB, logger *slog.Logger) *Engine {
1515+ return &Engine{db: db, logger: logger}
1616+}
1717+1818+func (e *Engine) ComputeFeedSimilarity(ctx context.Context) error {
1919+ tx, err := e.db.BeginTx(ctx, nil)
2020+ if err != nil {
2121+ return err
2222+ }
2323+ defer func() { _ = tx.Rollback() }()
2424+2525+ if _, err := tx.ExecContext(ctx, `DELETE FROM feed_similarity`); err != nil {
2626+ return err
2727+ }
2828+2929+ _, err = tx.ExecContext(ctx, `
3030+ INSERT INTO feed_similarity (feed_a, feed_b, jaccard)
3131+ SELECT
3232+ s1.feed_url,
3333+ s2.feed_url,
3434+ CAST(COUNT(*) AS REAL) / (f1.subscriber_count + f2.subscriber_count - CAST(COUNT(*) AS REAL))
3535+ FROM subscriptions s1
3636+ JOIN subscriptions s2 ON s1.user_did = s2.user_did AND s1.feed_url < s2.feed_url
3737+ JOIN feeds f1 ON f1.feed_url = s1.feed_url
3838+ JOIN feeds f2 ON f2.feed_url = s2.feed_url
3939+ GROUP BY s1.feed_url, s2.feed_url
4040+ HAVING COUNT(*) > 0
4141+ `)
4242+ if err != nil {
4343+ return err
4444+ }
4545+4646+ e.logger.Info("feed similarity computed")
4747+ return tx.Commit()
4848+}
4949+5050+func (e *Engine) ComputeUserSimilarity(ctx context.Context) error {
5151+ tx, err := e.db.BeginTx(ctx, nil)
5252+ if err != nil {
5353+ return err
5454+ }
5555+ defer func() { _ = tx.Rollback() }()
5656+5757+ if _, err := tx.ExecContext(ctx, `DELETE FROM user_similarity`); err != nil {
5858+ return err
5959+ }
6060+6161+ _, err = tx.ExecContext(ctx, `
6262+ INSERT INTO user_similarity (user_a, user_b, jaccard, common_feeds)
6363+ SELECT
6464+ s1.user_did,
6565+ s2.user_did,
6666+ CAST(COUNT(*) AS REAL) / (
6767+ (SELECT COUNT(*) FROM subscriptions WHERE user_did = s1.user_did) +
6868+ (SELECT COUNT(*) FROM subscriptions WHERE user_did = s2.user_did) -
6969+ CAST(COUNT(*) AS REAL)
7070+ ),
7171+ COUNT(*)
7272+ FROM subscriptions s1
7373+ JOIN subscriptions s2 ON s1.feed_url = s2.feed_url AND s1.user_did < s2.user_did
7474+ GROUP BY s1.user_did, s2.user_did
7575+ HAVING COUNT(*) > 0
7676+ `)
7777+ if err != nil {
7878+ return err
7979+ }
8080+8181+ e.logger.Info("user similarity computed")
8282+ return tx.Commit()
8383+}
8484+8585+func (e *Engine) ComputeRecommendations(ctx context.Context) error {
8686+ tx, err := e.db.BeginTx(ctx, nil)
8787+ if err != nil {
8888+ return err
8989+ }
9090+ defer func() { _ = tx.Rollback() }()
9191+9292+ if _, err := tx.ExecContext(ctx, `DELETE FROM user_feed_recommendations`); err != nil {
9393+ return err
9494+ }
9595+9696+ _, err = tx.ExecContext(ctx, `
9797+ INSERT INTO user_feed_recommendations (user_did, feed_url, score)
9898+ SELECT target, feed_url, SUM(jaccard) AS score
9999+ FROM (
100100+ SELECT us.user_a AS target, s.feed_url, us.jaccard
101101+ FROM user_similarity us
102102+ JOIN subscriptions s ON s.user_did = us.user_b
103103+ WHERE us.jaccard > 0.2
104104+ AND s.feed_url NOT IN (SELECT feed_url FROM subscriptions WHERE user_did = us.user_a)
105105+106106+ UNION ALL
107107+108108+ SELECT us.user_b AS target, s.feed_url, us.jaccard
109109+ FROM user_similarity us
110110+ JOIN subscriptions s ON s.user_did = us.user_a
111111+ WHERE us.jaccard > 0.2
112112+ AND s.feed_url NOT IN (SELECT feed_url FROM subscriptions WHERE user_did = us.user_b)
113113+ )
114114+ GROUP BY target, feed_url
115115+ ORDER BY score DESC
116116+ `)
117117+ if err != nil {
118118+ return err
119119+ }
120120+121121+ e.logger.Info("feed recommendations computed")
122122+ return tx.Commit()
123123+}
+111
internal/cluster/recommender.go
package cluster

import (
	"context"
)

// GetFeedRecommendations returns the precomputed feed recommendations for a
// user, highest score first. Nullable feed columns are coalesced to empty
// strings so they can be scanned into plain Go strings.
func (e *Engine) GetFeedRecommendations(ctx context.Context, userDID string, limit int) ([]map[string]any, error) {
	rows, err := e.db.QueryContext(ctx, `
		SELECT r.feed_url, COALESCE(f.title, ''), COALESCE(f.site_url, ''),
			COALESCE(f.description, ''), COALESCE(f.feed_type, ''), r.score
		FROM user_feed_recommendations r
		JOIN feeds f ON f.feed_url = r.feed_url
		WHERE r.user_did = ?
		ORDER BY r.score DESC
		LIMIT ?
	`, userDID, limit)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var results []map[string]any
	for rows.Next() {
		var feedURL, title, siteURL, description, feedType string
		var score float64
		if err := rows.Scan(&feedURL, &title, &siteURL, &description, &feedType, &score); err != nil {
			return nil, err
		}
		results = append(results, map[string]any{
			"feed_url":    feedURL,
			"title":       title,
			"site_url":    siteURL,
			"description": description,
			"feed_type":   feedType,
			"score":       score,
		})
	}
	return results, rows.Err()
}

// GetPeopleRecommendations returns the users most similar to userDID. Each
// pair is stored once with user_a < user_b, so both directions are queried.
func (e *Engine) GetPeopleRecommendations(ctx context.Context, userDID string, limit int) ([]map[string]any, error) {
	rows, err := e.db.QueryContext(ctx, `
		SELECT u.did, u.handle, COALESCE(u.display_name, ''), COALESCE(u.avatar_url, ''),
			sim.jaccard, sim.common_feeds
		FROM (
			SELECT user_b AS peer_did, jaccard, common_feeds FROM user_similarity WHERE user_a = ?
			UNION ALL
			SELECT user_a AS peer_did, jaccard, common_feeds FROM user_similarity WHERE user_b = ?
		) sim
		JOIN users u ON u.did = sim.peer_did
		ORDER BY sim.jaccard DESC
		LIMIT ?
	`, userDID, userDID, limit)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var results []map[string]any
	for rows.Next() {
		var did, handle, displayName, avatarURL string
		var jaccard float64
		var commonFeeds int
		if err := rows.Scan(&did, &handle, &displayName, &avatarURL, &jaccard, &commonFeeds); err != nil {
			return nil, err
		}
		results = append(results, map[string]any{
			"did":          did,
			"handle":       handle,
			"display_name": displayName,
			"avatar_url":   avatarURL,
			"jaccard":      jaccard,
			"common_feeds": commonFeeds,
		})
	}
	return results, rows.Err()
}

// GetSimilarFeeds returns the feeds most similar to feedURL. Each pair is
// stored once with feed_a < feed_b, so both directions are queried.
func (e *Engine) GetSimilarFeeds(ctx context.Context, feedURL string, limit int) ([]map[string]any, error) {
	rows, err := e.db.QueryContext(ctx, `
		SELECT f.feed_url, COALESCE(f.title, ''), COALESCE(f.site_url, ''),
			COALESCE(f.description, ''), COALESCE(f.feed_type, ''), sim.jaccard
		FROM (
			SELECT feed_b AS peer_url, jaccard FROM feed_similarity WHERE feed_a = ?
			UNION ALL
			SELECT feed_a AS peer_url, jaccard FROM feed_similarity WHERE feed_b = ?
		) sim
		JOIN feeds f ON f.feed_url = sim.peer_url
		ORDER BY sim.jaccard DESC
		LIMIT ?
	`, feedURL, feedURL, limit)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var results []map[string]any
	for rows.Next() {
		var peerURL, title, siteURL, description, feedType string
		var jaccard float64
		if err := rows.Scan(&peerURL, &title, &siteURL, &description, &feedType, &jaccard); err != nil {
			return nil, err
		}
		results = append(results, map[string]any{
			"feed_url":    peerURL,
			"title":       title,
			"site_url":    siteURL,
			"description": description,
			"feed_type":   feedType,
			"jaccard":     jaccard,
		})
	}
	return results, rows.Err()
}
+266
internal/db/article.go
···11+package db
22+33+import (
44+ "context"
55+ "database/sql"
66+ "fmt"
77+)
88+99+type Article struct {
1010+ ID int64
1111+ FeedURL string
1212+ FeedTitle string
1313+ GUID string
1414+ Title string
1515+ URL sql.NullString
1616+ Author sql.NullString
1717+ Summary sql.NullString
1818+ Content sql.NullString
1919+ Published sql.NullTime
2020+ Updated sql.NullTime
2121+ FetchedAt sql.NullTime
2222+ IsRead sql.NullBool
2323+ IsStarred sql.NullBool
2424+}
2525+2626+type ReadState struct {
2727+ UserDID string
2828+ ArticleID int64
2929+ IsRead bool
3030+ ReadAt sql.NullTime
3131+ IsStarred bool
3232+ StarredAt sql.NullTime
3333+}
3434+3535+func (db *DB) UpsertArticle(ctx context.Context, article *Article) (int64, error) {
3636+ var id int64
3737+ err := db.QueryRowContext(ctx, `
3838+ INSERT INTO articles (feed_url, guid, title, url, author, summary, content, published, updated)
3939+ VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
4040+ ON CONFLICT(feed_url, guid) DO NOTHING
4141+ RETURNING id
4242+ `, article.FeedURL, article.GUID, article.Title, article.URL, article.Author,
4343+ article.Summary, article.Content, article.Published, article.Updated).Scan(&id)
4444+ if err == sql.ErrNoRows {
4545+ err = db.QueryRowContext(ctx, `
4646+ SELECT id FROM articles WHERE feed_url = ? AND guid = ?
4747+ `, article.FeedURL, article.GUID).Scan(&id)
4848+ }
4949+ return id, err
5050+}
5151+5252+func (db *DB) GetArticle(ctx context.Context, id int64) (*Article, error) {
5353+ a := &Article{}
5454+ err := db.QueryRowContext(ctx, `
5555+ SELECT id, feed_url, guid, title, url, author, summary, content, published, updated, fetched_at
5656+ FROM articles WHERE id = ?
5757+ `, id).Scan(&a.ID, &a.FeedURL, &a.GUID, &a.Title, &a.URL, &a.Author,
5858+ &a.Summary, &a.Content, &a.Published, &a.Updated, &a.FetchedAt)
5959+ if err != nil {
6060+ return nil, err
6161+ }
6262+ return a, nil
6363+}

// ListArticles returns articles from the user's subscriptions, newest first,
// optionally restricted to a single feed.
func (db *DB) ListArticles(ctx context.Context, userDID, feedURL string, limit, offset int) ([]*Article, error) {
	query := `
		SELECT a.id, a.feed_url, COALESCE(f.title, ''), a.guid, a.title, a.url, a.author, a.summary, a.content,
			a.published, a.updated, a.fetched_at,
			COALESCE(r.is_read, 0), COALESCE(r.is_starred, 0)
		FROM articles a
		JOIN subscriptions s ON a.feed_url = s.feed_url AND s.user_did = ?
		LEFT JOIN feeds f ON a.feed_url = f.feed_url
		LEFT JOIN read_state r ON r.user_did = ? AND r.article_id = a.id
	`
	args := []any{userDID, userDID}

	// The feed filter needs its own WHERE clause. Appending it with AND would
	// attach it to the LEFT JOIN condition and leave the rows unfiltered.
	if feedURL != "" {
		query += ` WHERE a.feed_url = ?`
		args = append(args, feedURL)
	}

	// Bind LIMIT and OFFSET as parameters rather than formatting them in.
	query += ` ORDER BY a.published DESC LIMIT ? OFFSET ?`
	args = append(args, limit, offset)

	rows, err := db.QueryContext(ctx, query, args...)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var articles []*Article
	for rows.Next() {
		a := &Article{}
		if err := rows.Scan(&a.ID, &a.FeedURL, &a.FeedTitle, &a.GUID, &a.Title, &a.URL, &a.Author,
			&a.Summary, &a.Content, &a.Published, &a.Updated, &a.FetchedAt,
			&a.IsRead, &a.IsStarred); err != nil {
			return nil, err
		}
		articles = append(articles, a)
	}
	return articles, rows.Err()
}

// ListUnreadArticles returns unread articles from the user's subscriptions,
// newest first, optionally restricted to a single feed.
func (db *DB) ListUnreadArticles(ctx context.Context, userDID, feedURL string, limit, offset int) ([]*Article, error) {
	// The unread test is parenthesized so that an appended AND filter applies
	// to both branches of the OR, not only to the IS NULL branch.
	query := `
		SELECT a.id, a.feed_url, COALESCE(f.title, ''), a.guid, a.title, a.url, a.author, a.summary, a.content,
			a.published, a.updated, a.fetched_at
		FROM articles a
		JOIN subscriptions s ON a.feed_url = s.feed_url AND s.user_did = ?
		LEFT JOIN feeds f ON a.feed_url = f.feed_url
		LEFT JOIN read_state r ON r.user_did = ? AND r.article_id = a.id
		WHERE (r.is_read = 0 OR r.is_read IS NULL)
	`
	args := []any{userDID, userDID}

	if feedURL != "" {
		query += ` AND a.feed_url = ?`
		args = append(args, feedURL)
	}

	// Bind LIMIT and OFFSET as parameters rather than formatting them in.
	query += ` ORDER BY a.published DESC LIMIT ? OFFSET ?`
	args = append(args, limit, offset)

	rows, err := db.QueryContext(ctx, query, args...)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var articles []*Article
	for rows.Next() {
		a := &Article{}
		if err := rows.Scan(&a.ID, &a.FeedURL, &a.FeedTitle, &a.GUID, &a.Title, &a.URL, &a.Author,
			&a.Summary, &a.Content, &a.Published, &a.Updated, &a.FetchedAt); err != nil {
			return nil, err
		}
		articles = append(articles, a)
	}
	return articles, rows.Err()
}
139139+140140+func (db *DB) ListStarredArticles(ctx context.Context, userDID string, limit, offset int) ([]*Article, error) {
141141+ rows, err := db.QueryContext(ctx, fmt.Sprintf(`
142142+ SELECT a.id, a.feed_url, COALESCE(f.title, ''), a.guid, a.title, a.url, a.author, a.summary, a.content,
143143+ a.published, a.updated, a.fetched_at
144144+ FROM articles a
145145+ JOIN read_state r ON r.article_id = a.id AND r.user_did = ?
146146+ LEFT JOIN feeds f ON a.feed_url = f.feed_url
147147+ WHERE r.is_starred = 1
148148+ ORDER BY r.starred_at DESC
149149+ LIMIT %d OFFSET %d
150150+ `, limit, offset), userDID)
151151+ if err != nil {
152152+ return nil, err
153153+ }
154154+ defer rows.Close()
155155+156156+ var articles []*Article
157157+ for rows.Next() {
158158+ a := &Article{}
159159+ if err := rows.Scan(&a.ID, &a.FeedURL, &a.FeedTitle, &a.GUID, &a.Title, &a.URL, &a.Author,
160160+ &a.Summary, &a.Content, &a.Published, &a.Updated, &a.FetchedAt); err != nil {
161161+ return nil, err
162162+ }
163163+ articles = append(articles, a)
164164+ }
165165+ return articles, rows.Err()
166166+}
167167+168168+func (db *DB) MarkArticleRead(ctx context.Context, userDID string, articleID int64) error {
169169+ _, err := db.ExecContext(ctx, `
170170+ INSERT INTO read_state (user_did, article_id, is_read, read_at)
171171+ VALUES (?, ?, 1, CURRENT_TIMESTAMP)
172172+ ON CONFLICT(user_did, article_id) DO UPDATE SET
173173+ is_read = 1, read_at = CURRENT_TIMESTAMP
174174+ `, userDID, articleID)
175175+ return err
176176+}
177177+178178+func (db *DB) MarkArticleUnread(ctx context.Context, userDID string, articleID int64) error {
179179+ _, err := db.ExecContext(ctx, `
180180+ INSERT INTO read_state (user_did, article_id, is_read)
181181+ VALUES (?, ?, 0)
182182+ ON CONFLICT(user_did, article_id) DO UPDATE SET
183183+ is_read = 0, read_at = NULL
184184+ `, userDID, articleID)
185185+ return err
186186+}
187187+188188+func (db *DB) MarkAllRead(ctx context.Context, userDID, feedURL string) error {
189189+ _, err := db.ExecContext(ctx, `
190190+ INSERT INTO read_state (user_did, article_id, is_read, read_at)
191191+ SELECT ?, a.id, 1, CURRENT_TIMESTAMP
192192+ FROM articles a
193193+ JOIN subscriptions s ON a.feed_url = s.feed_url AND s.user_did = ?
194194+ WHERE a.feed_url = ?
195195+ ON CONFLICT(user_did, article_id) DO UPDATE SET
196196+ is_read = 1, read_at = CURRENT_TIMESTAMP
197197+ `, userDID, userDID, feedURL)
198198+ return err
199199+}
200200+201201+func (db *DB) MarkAllSubscribedRead(ctx context.Context, userDID string) error {
202202+ _, err := db.ExecContext(ctx, `
203203+ INSERT INTO read_state (user_did, article_id, is_read, read_at)
204204+ SELECT ?, a.id, 1, CURRENT_TIMESTAMP
205205+ FROM articles a
206206+ JOIN subscriptions s ON a.feed_url = s.feed_url AND s.user_did = ?
207207+ ON CONFLICT(user_did, article_id) DO UPDATE SET
208208+ is_read = 1, read_at = CURRENT_TIMESTAMP
209209+ `, userDID, userDID)
210210+ return err
211211+}
212212+213213+func (db *DB) StarArticle(ctx context.Context, userDID string, articleID int64) error {
214214+ _, err := db.ExecContext(ctx, `
215215+ INSERT INTO read_state (user_did, article_id, is_starred, starred_at)
216216+ VALUES (?, ?, 1, CURRENT_TIMESTAMP)
217217+ ON CONFLICT(user_did, article_id) DO UPDATE SET
218218+ is_starred = 1, starred_at = CURRENT_TIMESTAMP
219219+ `, userDID, articleID)
220220+ return err
221221+}
222222+223223+func (db *DB) UnstarArticle(ctx context.Context, userDID string, articleID int64) error {
224224+ _, err := db.ExecContext(ctx, `
225225+ UPDATE read_state SET is_starred = 0, starred_at = NULL
226226+ WHERE user_did = ? AND article_id = ?
227227+ `, userDID, articleID)
228228+ return err
229229+}
230230+231231+func (db *DB) GetReadState(ctx context.Context, userDID string, articleID int64) (*ReadState, error) {
232232+ rs := &ReadState{}
233233+ err := db.QueryRowContext(ctx, `
234234+ SELECT user_did, article_id, is_read, read_at, is_starred, starred_at
235235+ FROM read_state WHERE user_did = ? AND article_id = ?
236236+ `, userDID, articleID).Scan(&rs.UserDID, &rs.ArticleID, &rs.IsRead, &rs.ReadAt, &rs.IsStarred, &rs.StarredAt)
237237+ if err == sql.ErrNoRows {
238238+ return &ReadState{UserDID: userDID, ArticleID: articleID}, nil
239239+ }
240240+ if err != nil {
241241+ return nil, err
242242+ }
243243+ return rs, nil
244244+}
245245+246246+func (db *DB) GetUnreadCount(ctx context.Context, userDID, feedURL string) (int, error) {
247247+ var count int
248248+ if feedURL != "" {
249249+ err := db.QueryRowContext(ctx, `
250250+ SELECT COUNT(*)
251251+ FROM articles a
252252+ JOIN subscriptions s ON a.feed_url = s.feed_url AND s.user_did = ?
253253+ LEFT JOIN read_state r ON r.user_did = ? AND r.article_id = a.id
254254+ WHERE a.feed_url = ? AND (r.is_read = 0 OR r.is_read IS NULL)
255255+ `, userDID, userDID, feedURL).Scan(&count)
256256+ return count, err
257257+ }
258258+ err := db.QueryRowContext(ctx, `
259259+ SELECT COUNT(*)
260260+ FROM articles a
261261+ JOIN subscriptions s ON a.feed_url = s.feed_url AND s.user_did = ?
262262+ LEFT JOIN read_state r ON r.user_did = ? AND r.article_id = a.id
263263+ WHERE r.is_read = 0 OR r.is_read IS NULL
264264+ `, userDID, userDID).Scan(&count)
265265+ return count, err
266266+}
+306
internal/db/cluster.go
···11+package db
22+33+import (
44+ "context"
55+ "database/sql"
66+ "fmt"
77+)

// ComputeFeedSimilarity rebuilds feed_similarity from scratch. Every pair of
// feeds with at least one shared subscriber gets a Jaccard score:
// subscribers of both feeds divided by subscribers of either feed.
func (db *DB) ComputeFeedSimilarity(ctx context.Context) error {
	_, err := db.ExecContext(ctx, `DELETE FROM feed_similarity`)
	if err != nil {
		return err
	}

	// s1.feed_url < s2.feed_url emits each unordered pair once and matches
	// the CHECK(feed_a < feed_b) constraint on the table.
	rows, err := db.QueryContext(ctx, `
		SELECT s1.feed_url, s2.feed_url, COUNT(*) AS overlap
		FROM subscriptions s1
		JOIN subscriptions s2 ON s1.user_did = s2.user_did AND s1.feed_url < s2.feed_url
		GROUP BY s1.feed_url, s2.feed_url
		HAVING overlap > 0
	`)
	if err != nil {
		return err
	}
	defer rows.Close()

	type pair struct {
		feedA   string
		feedB   string
		overlap int
	}
	var pairs []pair
	for rows.Next() {
		var p pair
		if err := rows.Scan(&p.feedA, &p.feedB, &p.overlap); err != nil {
			return err
		}
		pairs = append(pairs, p)
	}
	if err := rows.Err(); err != nil {
		return err
	}

	subCounts := make(map[string]int)
	for _, p := range pairs {
		subCounts[p.feedA] = 0
		subCounts[p.feedB] = 0
	}

	if len(subCounts) > 0 {
		countRows, err := db.QueryContext(ctx, `
			SELECT feed_url, COUNT(*) FROM subscriptions GROUP BY feed_url
		`)
		if err != nil {
			return err
		}
		for countRows.Next() {
			var feedURL string
			var count int
			if err := countRows.Scan(&feedURL, &count); err != nil {
				countRows.Close()
				return err
			}
			subCounts[feedURL] = count
		}
		// Next returns false on both end-of-data and failure, so check Err
		// before trusting the counts.
		if err := countRows.Err(); err != nil {
			countRows.Close()
			return err
		}
		countRows.Close()
	}

	for _, p := range pairs {
		// Union size via inclusion-exclusion: |A| + |B| - |A and B|.
		total := subCounts[p.feedA] + subCounts[p.feedB] - p.overlap
		if total == 0 {
			continue
		}
		jaccard := float64(p.overlap) / float64(total)
		_, err := db.ExecContext(ctx, `
			INSERT INTO feed_similarity (feed_a, feed_b, jaccard, computed_at)
			VALUES (?, ?, ?, CURRENT_TIMESTAMP)
		`, p.feedA, p.feedB, jaccard)
		if err != nil {
			return err
		}
	}

	return nil
}

// ComputeUserSimilarity rebuilds user_similarity from scratch. Every pair of
// users with at least one feed in common gets a Jaccard score over their
// subscription sets, plus the raw count of shared feeds.
func (db *DB) ComputeUserSimilarity(ctx context.Context) error {
	_, err := db.ExecContext(ctx, `DELETE FROM user_similarity`)
	if err != nil {
		return err
	}

	// s1.user_did < s2.user_did emits each unordered pair once and matches
	// the CHECK(user_a < user_b) constraint on the table.
	rows, err := db.QueryContext(ctx, `
		SELECT s1.user_did, s2.user_did, COUNT(*) AS common
		FROM subscriptions s1
		JOIN subscriptions s2 ON s1.user_did < s2.user_did AND s1.feed_url = s2.feed_url
		GROUP BY s1.user_did, s2.user_did
		HAVING common > 0
	`)
	if err != nil {
		return err
	}
	defer rows.Close()

	type pair struct {
		userA  string
		userB  string
		common int
	}
	var pairs []pair
	for rows.Next() {
		var p pair
		if err := rows.Scan(&p.userA, &p.userB, &p.common); err != nil {
			return err
		}
		pairs = append(pairs, p)
	}
	if err := rows.Err(); err != nil {
		return err
	}

	subCounts := make(map[string]int)
	for _, p := range pairs {
		subCounts[p.userA] = 0
		subCounts[p.userB] = 0
	}

	if len(subCounts) > 0 {
		countRows, err := db.QueryContext(ctx, `
			SELECT user_did, COUNT(*) FROM subscriptions GROUP BY user_did
		`)
		if err != nil {
			return err
		}
		for countRows.Next() {
			var userDID string
			var count int
			if err := countRows.Scan(&userDID, &count); err != nil {
				countRows.Close()
				return err
			}
			subCounts[userDID] = count
		}
		// Next returns false on both end-of-data and failure, so check Err
		// before trusting the counts.
		if err := countRows.Err(); err != nil {
			countRows.Close()
			return err
		}
		countRows.Close()
	}

	for _, p := range pairs {
		// Union size via inclusion-exclusion: |A| + |B| - |A and B|.
		total := subCounts[p.userA] + subCounts[p.userB] - p.common
		if total == 0 {
			continue
		}
		jaccard := float64(p.common) / float64(total)
		_, err := db.ExecContext(ctx, `
			INSERT INTO user_similarity (user_a, user_b, jaccard, common_feeds, computed_at)
			VALUES (?, ?, ?, ?, CURRENT_TIMESTAMP)
		`, p.userA, p.userB, jaccard, p.common)
		if err != nil {
			return err
		}
	}

	return nil
}

// ComputeFeedRecommendations replaces the stored recommendations for one
// user. A candidate feed's score is the sum of its Jaccard similarities to
// the feeds the user already subscribes to.
func (db *DB) ComputeFeedRecommendations(ctx context.Context, userDID string) error {
	_, err := db.ExecContext(ctx, `
		DELETE FROM user_feed_recommendations WHERE user_did = ?
	`, userDID)
	if err != nil {
		return err
	}

	rows, err := db.QueryContext(ctx, `
		SELECT
			CASE WHEN fs.feed_a IN (SELECT feed_url FROM subscriptions WHERE user_did = ?) THEN fs.feed_b ELSE fs.feed_a END AS recommended_feed,
			SUM(fs.jaccard) AS score
		FROM feed_similarity fs
		WHERE fs.feed_a IN (SELECT feed_url FROM subscriptions WHERE user_did = ?)
			OR fs.feed_b IN (SELECT feed_url FROM subscriptions WHERE user_did = ?)
		GROUP BY recommended_feed
		HAVING recommended_feed NOT IN (SELECT feed_url FROM subscriptions WHERE user_did = ?)
		ORDER BY score DESC
	`, userDID, userDID, userDID, userDID)
	if err != nil {
		return err
	}
	defer rows.Close()

	// Buffer the results before writing. Open caps the pool at one
	// connection (SetMaxOpenConns(1)), so calling ExecContext while rows is
	// still open would block forever waiting for that same connection.
	type rec struct {
		feedURL string
		score   float64
	}
	var recs []rec
	for rows.Next() {
		var r rec
		if err := rows.Scan(&r.feedURL, &r.score); err != nil {
			return err
		}
		recs = append(recs, r)
	}
	if err := rows.Err(); err != nil {
		return err
	}
	if err := rows.Close(); err != nil {
		return err
	}

	for _, r := range recs {
		_, err := db.ExecContext(ctx, `
			INSERT INTO user_feed_recommendations (user_did, feed_url, score, computed_at)
			VALUES (?, ?, ?, CURRENT_TIMESTAMP)
		`, userDID, r.feedURL, r.score)
		if err != nil {
			return err
		}
	}
	return nil
}
205205+206206+func (db *DB) GetFeedRecommendations(ctx context.Context, userDID string, limit int) ([]map[string]any, error) {
207207+ rows, err := db.QueryContext(ctx, `
208208+ SELECT r.feed_url, r.score, f.title, f.site_url, f.description, f.subscriber_count
209209+ FROM user_feed_recommendations r
210210+ JOIN feeds f ON f.feed_url = r.feed_url
211211+ WHERE r.user_did = ?
212212+ ORDER BY r.score DESC
213213+ LIMIT ?
214214+ `, userDID, limit)
215215+ if err != nil {
216216+ return nil, err
217217+ }
218218+ defer rows.Close()
219219+220220+ var results []map[string]any
221221+ for rows.Next() {
222222+ var feedURL string
223223+ var score float64
224224+ var title, siteURL, description sql.NullString
225225+ var subCount int
226226+ if err := rows.Scan(&feedURL, &score, &title, &siteURL, &description, &subCount); err != nil {
227227+ return nil, err
228228+ }
229229+ results = append(results, map[string]any{
230230+ "feed_url": feedURL,
231231+ "score": score,
232232+ "title": title,
233233+ "site_url": siteURL,
234234+ "description": description,
235235+ "subscriber_count": subCount,
236236+ })
237237+ }
238238+ return results, rows.Err()
239239+}
240240+241241+func (db *DB) GetPeopleRecommendations(ctx context.Context, userDID string, limit int) ([]map[string]any, error) {
242242+ rows, err := db.QueryContext(ctx, fmt.Sprintf(`
243243+ SELECT
244244+ CASE WHEN us.user_a = ? THEN us.user_b ELSE us.user_a END AS recommended_user,
245245+ us.jaccard, us.common_feeds,
246246+ u.handle, u.display_name, u.avatar_url
247247+ FROM user_similarity us
248248+ JOIN users u ON u.did = CASE WHEN us.user_a = ? THEN us.user_b ELSE us.user_a END
249249+ WHERE us.user_a = ? OR us.user_b = ?
250250+ ORDER BY us.jaccard DESC
251251+ LIMIT %d
252252+ `, limit), userDID, userDID, userDID, userDID)
253253+ if err != nil {
254254+ return nil, err
255255+ }
256256+ defer rows.Close()
257257+258258+ var results []map[string]any
259259+ for rows.Next() {
260260+ var recUser, handle string
261261+ var jaccard float64
262262+ var commonFeeds int
263263+ var displayName, avatarURL sql.NullString
264264+ if err := rows.Scan(&recUser, &jaccard, &commonFeeds, &handle, &displayName, &avatarURL); err != nil {
265265+ return nil, err
266266+ }
267267+ results = append(results, map[string]any{
268268+ "did": recUser,
269269+ "jaccard": jaccard,
270270+ "common_feeds": commonFeeds,
271271+ "handle": handle,
272272+ "display_name": displayName,
273273+ "avatar_url": avatarURL,
274274+ })
275275+ }
276276+ return results, rows.Err()
277277+}
278278+279279+func (db *DB) GetSimilarFeeds(ctx context.Context, feedURL string, limit int) ([]*Feed, error) {
280280+ rows, err := db.QueryContext(ctx, `
281281+ SELECT f.feed_url, f.title, f.site_url, f.description, f.feed_type,
282282+ f.last_fetched_at, f.last_error, f.subscriber_count, f.etag, f.last_modified,
283283+ f.fetch_interval_minutes, f.next_fetch_at, f.consecutive_empty_fetches, f.error_count
284284+ FROM feed_similarity fs
285285+ JOIN feeds f ON f.feed_url = CASE WHEN fs.feed_a = ? THEN fs.feed_b ELSE fs.feed_a END
286286+ WHERE fs.feed_a = ? OR fs.feed_b = ?
287287+ ORDER BY fs.jaccard DESC
288288+ LIMIT ?
289289+ `, feedURL, feedURL, feedURL, limit)
290290+ if err != nil {
291291+ return nil, err
292292+ }
293293+ defer rows.Close()
294294+295295+ var feeds []*Feed
296296+ for rows.Next() {
297297+ f := &Feed{}
298298+ if err := rows.Scan(&f.FeedURL, &f.Title, &f.SiteURL, &f.Description, &f.FeedType,
299299+ &f.LastFetchedAt, &f.LastError, &f.SubscriberCount, &f.Etag, &f.LastModified,
300300+ &f.FetchIntervalMinutes, &f.NextFetchAt, &f.ConsecutiveEmptyFetches, &f.ErrorCount); err != nil {
301301+ return nil, err
302302+ }
303303+ feeds = append(feeds, f)
304304+ }
305305+ return feeds, rows.Err()
306306+}
+164
internal/db/db.go
···11+package db
22+33+import (
44+ "database/sql"
55+ _ "modernc.org/sqlite"
66+)
77+88+type DB struct {
99+ *sql.DB
1010+}
1111+1212+func Open(path string) (*DB, error) {
1313+ db, err := sql.Open("sqlite", path)
1414+ if err != nil {
1515+ return nil, err
1616+ }
1717+1818+ db.SetMaxOpenConns(1)
1919+2020+ if err := migrate(db); err != nil {
2121+ db.Close()
2222+ return nil, err
2323+ }
2424+2525+ return &DB{db}, nil
2626+}
2727+2828+func migrate(db *sql.DB) error {
2929+ tx, err := db.Begin()
3030+ if err != nil {
3131+ return err
3232+ }
3333+ defer func() { _ = tx.Rollback() }()
3434+3535+ stmts := []string{
3636+ `CREATE TABLE IF NOT EXISTS users (
3737+ did TEXT PRIMARY KEY,
3838+ handle TEXT NOT NULL,
3939+ display_name TEXT,
4040+ avatar_url TEXT,
4141+ indexed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
4242+ updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
4343+ )`,
4444+ `CREATE TABLE IF NOT EXISTS feeds (
4545+ feed_url TEXT PRIMARY KEY,
4646+ title TEXT,
4747+ site_url TEXT,
4848+ description TEXT,
4949+ feed_type TEXT CHECK(feed_type IN ('rss', 'atom', 'json')),
5050+ last_fetched_at DATETIME,
5151+ last_error TEXT,
5252+ subscriber_count INTEGER NOT NULL DEFAULT 0,
5353+ etag TEXT,
5454+ last_modified TEXT,
5555+ fetch_interval_minutes INTEGER NOT NULL DEFAULT 30,
5656+ next_fetch_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
5757+ consecutive_empty_fetches INTEGER NOT NULL DEFAULT 0,
5858+ error_count INTEGER NOT NULL DEFAULT 0
5959+ )`,
6060+ `CREATE TABLE IF NOT EXISTS subscriptions (
6161+ id INTEGER PRIMARY KEY AUTOINCREMENT,
6262+ user_did TEXT NOT NULL REFERENCES users(did),
6363+ feed_url TEXT NOT NULL REFERENCES feeds(feed_url),
6464+ category TEXT,
6565+ added_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
6666+ UNIQUE(user_did, feed_url)
6767+ )`,
6868+ `CREATE TABLE IF NOT EXISTS articles (
6969+ id INTEGER PRIMARY KEY AUTOINCREMENT,
7070+ feed_url TEXT NOT NULL REFERENCES feeds(feed_url),
7171+ guid TEXT NOT NULL,
7272+ title TEXT NOT NULL DEFAULT '',
7373+ url TEXT,
7474+ author TEXT,
7575+ summary TEXT,
7676+ content TEXT,
7777+ published DATETIME,
7878+ updated DATETIME,
7979+ fetched_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
8080+ UNIQUE(feed_url, guid)
8181+ )`,
8282+ `CREATE TABLE IF NOT EXISTS read_state (
8383+ user_did TEXT NOT NULL REFERENCES users(did),
8484+ article_id INTEGER NOT NULL REFERENCES articles(id),
8585+ is_read BOOLEAN NOT NULL DEFAULT 0,
8686+ read_at DATETIME,
8787+ is_starred BOOLEAN NOT NULL DEFAULT 0,
8888+ starred_at DATETIME,
8989+ PRIMARY KEY (user_did, article_id)
9090+ )`,
9191+ `CREATE TABLE IF NOT EXISTS annotations (
9292+ id INTEGER PRIMARY KEY AUTOINCREMENT,
9393+ uri TEXT NOT NULL UNIQUE,
9494+ author_did TEXT NOT NULL REFERENCES users(did),
9595+ feed_url TEXT NOT NULL,
9696+ article_url TEXT NOT NULL,
9797+ quote TEXT,
9898+ note TEXT,
9999+ tags TEXT,
100100+ rating INTEGER,
101101+ created_at DATETIME NOT NULL,
102102+ cid TEXT
103103+ )`,
104104+ `CREATE TABLE IF NOT EXISTS likes (
105105+ id INTEGER PRIMARY KEY AUTOINCREMENT,
106106+ uri TEXT NOT NULL UNIQUE,
107107+ author_did TEXT NOT NULL REFERENCES users(did),
108108+ feed_url TEXT NOT NULL,
109109+ article_url TEXT NOT NULL,
110110+ created_at DATETIME NOT NULL,
111111+ cid TEXT,
112112+ UNIQUE(author_did, feed_url, article_url)
113113+ )`,
114114+ `CREATE TABLE IF NOT EXISTS feed_similarity (
115115+ feed_a TEXT NOT NULL REFERENCES feeds(feed_url),
116116+ feed_b TEXT NOT NULL REFERENCES feeds(feed_url),
117117+ jaccard REAL NOT NULL,
118118+ computed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
119119+ PRIMARY KEY (feed_a, feed_b),
120120+ CHECK(feed_a < feed_b)
121121+ )`,
122122+ `CREATE TABLE IF NOT EXISTS user_similarity (
123123+ user_a TEXT NOT NULL REFERENCES users(did),
124124+ user_b TEXT NOT NULL REFERENCES users(did),
125125+ jaccard REAL NOT NULL,
126126+ common_feeds INTEGER NOT NULL,
127127+ computed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
128128+ PRIMARY KEY (user_a, user_b),
129129+ CHECK(user_a < user_b)
130130+ )`,
131131+ `CREATE TABLE IF NOT EXISTS user_feed_recommendations (
132132+ user_did TEXT NOT NULL REFERENCES users(did),
133133+ feed_url TEXT NOT NULL REFERENCES feeds(feed_url),
134134+ score REAL NOT NULL,
135135+ computed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
136136+ PRIMARY KEY (user_did, feed_url)
137137+ )`,
138138+ `CREATE TABLE IF NOT EXISTS user_article_recommendations (
139139+ user_did TEXT NOT NULL REFERENCES users(did),
140140+ feed_url TEXT NOT NULL,
141141+ article_url TEXT NOT NULL,
142142+ score REAL NOT NULL,
143143+ computed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
144144+ PRIMARY KEY (user_did, feed_url, article_url)
145145+ )`,
146146+ `CREATE INDEX IF NOT EXISTS idx_subscriptions_feed ON subscriptions(feed_url)`,
147147+ `CREATE INDEX IF NOT EXISTS idx_subscriptions_user ON subscriptions(user_did)`,
148148+ `CREATE INDEX IF NOT EXISTS idx_articles_feed ON articles(feed_url)`,
149149+ `CREATE INDEX IF NOT EXISTS idx_articles_published ON articles(published DESC)`,
150150+ `CREATE INDEX IF NOT EXISTS idx_read_state_unread ON read_state(user_did, is_read) WHERE is_read = 0`,
151151+ `CREATE INDEX IF NOT EXISTS idx_read_state_starred ON read_state(user_did, is_starred) WHERE is_starred = 1`,
152152+ `CREATE INDEX IF NOT EXISTS idx_annotations_article ON annotations(article_url)`,
153153+ `CREATE INDEX IF NOT EXISTS idx_likes_article ON likes(feed_url, article_url)`,
154154+ `CREATE INDEX IF NOT EXISTS idx_likes_author ON likes(author_did)`,
155155+ }
156156+157157+ for _, s := range stmts {
158158+ if _, err := tx.Exec(s); err != nil {
159159+ return err
160160+ }
161161+ }
162162+163163+ return tx.Commit()
164164+}