A social RSS reader built on the AT Protocol. glean.at

Update docs

+69 -56
+1
.env.example
@@ -6,6 +6,7 @@
 GLEAN_CLUSTER_INTERVAL=15m
 GLEAN_FETCH_INTERVAL=5m
 GLEAN_COLLECTION_DIR_URL=https://lightrail.microcosm.blue/xrpc/com.atproto.sync.listReposByCollection?collection=at.glean.subscription
+GLEAN_BACKFILL_CONCURRENCY=5
 # Leave empty for localhost OAuth (development)
 # GLEAN_OAUTH_CLIENT_ID=https://glean.at/oauth/client-metadata
 # GLEAN_OAUTH_REDIRECT_URL=https://glean.at/auth/callback
+11 -11
docs/design.md
@@ -158,15 +158,15 @@
 
 ## 8. Page Structure
 
-| Page | Layout | Key Features |
-| --------------- | ---------------------- | ---------------------------------------------- |
-| Index (landing) | Full-width, no sidebar | Hero with mockup, feature cols, dark band, CTA |
-| Login | Centered card | Bluesky + Atmosphere sign-in buttons |
-| Dashboard | 2/3 + 1/3 grid | Articles + trending/recommendations sidebar |
-| Articles | Full-width list | Keyboard nav (j/k/o/m), mark-all-read |
-| Article Detail | `max-w-3xl` centered | Content, like/share/read buttons, annotations |
+| Page | Layout | Key Features |
+| --------------- | ---------------------- | -------------------------------------------------------------- |
+| Index (landing) | Full-width, no sidebar | Hero with mockup, feature cols, dark band, CTA |
+| Login | Centered card | Bluesky + Atmosphere sign-in buttons |
+| Dashboard | 2/3 + 1/3 grid | Articles + trending/recommendations sidebar |
+| Articles | Full-width list | Keyboard nav (j/k/o/m), mark-all-read |
+| Article Detail | `max-w-3xl` centered | Content, like/share/read buttons, annotations |
 | Feeds | 2/3 + 1/3 grid | Feed list with categories + add/import sidebar, refresh button |
-| Trending | Full-width list | Like/annotation counts on each article |
-| Discover | Mixed grid | Recommendations + people + browse all |
-| Annotations | Full-width list | Filter by article URL, load more |
-| Profile | `max-w-2xl` centered | Avatar, stats, feeds, annotations |
+| Trending | Full-width list | Like/annotation counts on each article |
+| Discover | Mixed grid | Recommendations + people + browse all |
+| Annotations | Full-width list | Filter by article URL, load more |
+| Profile | `max-w-2xl` centered | Avatar, stats, feeds, annotations |
+56 -45
docs/specs.md
@@ -13,7 +13,7 @@
 | Layer | Technology |
 | ---------------- | --------------------------------------------------------------- |
 | Backend | Go |
-| Database | SQLite (3 files: users, articles, recs via `mattn/go-sqlite3`) |
+| Database | SQLite (3 files: users, articles, recs via `mattn/go-sqlite3`) |
 | Frontend | htmx + TailwindCSS |
 | Auth | AT Protocol OAuth / DID resolution (configurable PLC directory) |
 | AT Protocol role | AppView for `at.glean.*` lexicons |
@@ -374,7 +374,7 @@
 ```sql
 CREATE TABLE articles (
     id INTEGER PRIMARY KEY AUTOINCREMENT,
-    feed_url TEXT NOT NULL REFERENCES feeds(feed_url),
+    feed_url TEXT NOT NULL,
     guid TEXT NOT NULL,
     title TEXT NOT NULL DEFAULT '',
     url TEXT,
@@ -400,8 +400,8 @@
 
 ```sql
 CREATE TABLE read_state (
-    user_did TEXT NOT NULL REFERENCES users(did),
-    article_id INTEGER NOT NULL REFERENCES articles(id),
+    user_did TEXT NOT NULL,
+    article_id INTEGER NOT NULL,
     is_read BOOLEAN NOT NULL DEFAULT 0,
     read_at DATETIME,
     PRIMARY KEY (user_did, article_id)
@@ -477,24 +477,23 @@
 
 Glean uses three separate SQLite database files to reduce write-lock contention. Each is opened with its own connection pool:
 
-| File | Contents | ATTACH alias |
-|------|----------|-------------|
-| `<base>_users` | Users, follows, OAuth | `main` (primary) |
-| `<base>_articles` | Feeds, subscriptions, articles, read state, likes, annotations | `articles` |
-| `<base>_recs` | Similarity scores, impressions, dismissals, signal weights | `recs` |
+| File | Contents | ATTACH alias |
+| ----------------- | -------------------------------------------------------------- | ---------------- |
+| `<base>_users` | Users, follows, OAuth | `main` (primary) |
+| `<base>_articles` | Feeds, subscriptions, articles, read state, likes, annotations | `articles` |
+| `<base>_recs` | Similarity scores, impressions, dismissals, signal weights | `recs` |
 
 The users connection pool uses a custom SQLite driver with a `ConnectHook` that ATTACHes the articles and recs databases on every new connection. This allows the cluster engine to run cross-database queries using schema prefixes (`articles.subscriptions`, `recs.user_similarity`, `main.follows`).
 
-The articles and recs connections are independent pools used by the service layer for single-database operations.
+Foreign key constraints are not used because SQLite does not support foreign keys across ATTACHed databases. Referential integrity is enforced by the application layer.
 
 ### 6.1 Users (`<base>_users`)
+
+Profile data (handle, display name, avatar) is resolved on-the-fly via AT Protocol identity resolution rather than stored locally.
 
 ```sql
 CREATE TABLE users (
     did TEXT PRIMARY KEY,
-    handle TEXT NOT NULL,
-    display_name TEXT,
-    avatar_url TEXT,
     indexed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
     updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
 );
@@ -507,7 +506,7 @@
 ```sql
 CREATE TABLE subscriptions (
     id INTEGER PRIMARY KEY AUTOINCREMENT,
-    user_did TEXT NOT NULL REFERENCES users(did),
+    user_did TEXT NOT NULL,
     feed_url TEXT NOT NULL,
     title TEXT,
     category TEXT,
@@ -537,8 +536,6 @@
     subscriber_count INTEGER NOT NULL DEFAULT 0,
     etag TEXT,
     last_modified TEXT,
-    fetch_interval_minutes INTEGER NOT NULL DEFAULT 30,
-    next_fetch_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
     consecutive_empty_fetches INTEGER NOT NULL DEFAULT 0,
     error_count INTEGER NOT NULL DEFAULT 0,
     favicon_url TEXT
@@ -552,13 +549,14 @@
 ```sql
 CREATE TABLE articles (
     id INTEGER PRIMARY KEY AUTOINCREMENT,
-    feed_url TEXT NOT NULL REFERENCES feeds(feed_url),
+    feed_url TEXT NOT NULL,
     guid TEXT NOT NULL,
     title TEXT NOT NULL DEFAULT '',
     url TEXT,
     author TEXT,
     summary TEXT,
     content TEXT,
+    full_content TEXT,
     published DATETIME,
     updated DATETIME,
     fetched_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
@@ -573,8 +571,8 @@
 
 ```sql
 CREATE TABLE read_state (
-    user_did TEXT NOT NULL REFERENCES users(did),
-    article_id INTEGER NOT NULL REFERENCES articles(id),
+    user_did TEXT NOT NULL,
+    article_id INTEGER NOT NULL,
     is_read BOOLEAN NOT NULL DEFAULT 0,
     read_at DATETIME,
     PRIMARY KEY (user_did, article_id)
@@ -591,7 +589,7 @@
 CREATE TABLE annotations (
     id INTEGER PRIMARY KEY AUTOINCREMENT,
     uri TEXT NOT NULL UNIQUE,
-    author_did TEXT NOT NULL REFERENCES users(did),
+    author_did TEXT NOT NULL,
     feed_url TEXT NOT NULL,
     article_url TEXT NOT NULL,
     quote TEXT,
@@ -605,7 +603,7 @@
 CREATE TABLE likes (
     id INTEGER PRIMARY KEY AUTOINCREMENT,
     uri TEXT NOT NULL UNIQUE,
-    author_did TEXT NOT NULL REFERENCES users(did),
+    author_did TEXT NOT NULL,
     feed_url TEXT NOT NULL,
     article_url TEXT NOT NULL,
     created_at DATETIME NOT NULL,
@@ -620,8 +618,8 @@
 
 ```sql
 CREATE TABLE feed_similarity (
-    feed_a TEXT NOT NULL REFERENCES feeds(feed_url),
-    feed_b TEXT NOT NULL REFERENCES feeds(feed_url),
+    feed_a TEXT NOT NULL,
+    feed_b TEXT NOT NULL,
     jaccard REAL NOT NULL,
     computed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
     PRIMARY KEY (feed_a, feed_b),
@@ -629,10 +627,12 @@
 );
 
 CREATE TABLE user_similarity (
-    user_a TEXT NOT NULL REFERENCES users(did),
-    user_b TEXT NOT NULL REFERENCES users(did),
+    user_a TEXT NOT NULL,
+    user_b TEXT NOT NULL,
     jaccard REAL NOT NULL,
     common_feeds INTEGER NOT NULL,
+    common_likes INTEGER NOT NULL DEFAULT 0,
+    common_tags INTEGER NOT NULL DEFAULT 0,
     computed_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
     PRIMARY KEY (user_a, user_b),
     CHECK(user_a < user_b)
@@ -645,7 +645,7 @@
 
 ```sql
 CREATE TABLE follows (
-    user_did TEXT NOT NULL REFERENCES users(did),
+    user_did TEXT NOT NULL,
     target_did TEXT NOT NULL,
     uri TEXT,
     cid TEXT,
@@ -673,18 +673,20 @@
 );
 ```
 
+## 7. Recommendations
+
 Glean uses a multi-signal recommendation system that combines subscription overlap, like patterns, social graph distance, and user behavior feedback.
 
 ### 7.1 Signals
 
-| Signal | Source | Weight (default) | Description |
-|--------|--------|-------------------|-------------|
-| Subscription | `subscriptions` | 1.0 | Jaccard over subscriber sets between similar users |
-| Like | `likes` | 0.5 | Time-decayed like co-occurrence (30-day half-life) |
-| Tag | `annotations.tags` | 0.3 | Jaccard over annotation tag sets |
-| Social | `follow_distances` | 0.7 | Follow distance: 1-hop=1.0, 2-hop=0.3 |
-| Popularity | `feeds.subscriber_count` | 0.2 | `log(1 + subscribers) / log(1 + max)` |
-| Category | `subscriptions.category` | 0.4 | Boost feeds matching user's existing categories |
+| Signal | Source | Weight (default) | Description |
+| ------------ | ------------------------ | ---------------- | -------------------------------------------------- |
+| Subscription | `subscriptions` | 1.0 | Jaccard over subscriber sets between similar users |
+| Like | `likes` | 0.5 | Time-decayed like co-occurrence (30-day half-life) |
+| Tag | `annotations.tags` | 0.3 | Jaccard over annotation tag sets |
+| Social | `follow_distances` | 0.7 | Follow distance: 1-hop=1.0, 2-hop=0.3 |
+| Popularity | `feeds.subscriber_count` | 0.2 | `log(1 + subscribers) / log(1 + max)` |
+| Category | `subscriptions.category` | 0.4 | Boost feeds matching user's existing categories |
 
 ### 7.2 Feed Co-occurrence (Jaccard Similarity)
 
@@ -721,6 +723,7 @@
 ```
 
 Where:
+
 - `sub_signal = SUM(jaccard(target, U))` for similar users U subscribed to feed
 - `like_signal = SUM(jaccard(target, U) * time_decay)` for likes in that feed by similar users
 - `social_signal = SUM(distance_weight)` from follow_distances
@@ -794,11 +797,11 @@
 
 Jetstream ingestion and record indexing happen in a separate persistent goroutine (the Jetstream consumer), not in the cron.
 
-### 7.11 New Database Tables
+### 7.11 Recommendation Tables (`<base>_recs`)
 
 ```sql
 CREATE TABLE dismissed_recommendations (
-    user_did TEXT NOT NULL REFERENCES users(did),
+    user_did TEXT NOT NULL,
     target_type TEXT NOT NULL CHECK(target_type IN ('feed', 'article')),
     target_id TEXT NOT NULL,
     reason TEXT,
@@ -807,7 +810,7 @@
 );
 
 CREATE TABLE recommendation_impressions (
-    user_did TEXT NOT NULL REFERENCES users(did),
+    user_did TEXT NOT NULL,
     target_type TEXT NOT NULL CHECK(target_type IN ('feed', 'article')),
     target_id TEXT NOT NULL,
     first_shown_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
@@ -825,7 +828,7 @@
 );
 
 CREATE TABLE user_signal_weights (
-    user_did TEXT PRIMARY KEY REFERENCES users(did),
+    user_did TEXT PRIMARY KEY,
     w_sub REAL NOT NULL DEFAULT 1.0,
     w_like REAL NOT NULL DEFAULT 0.5,
     w_tag REAL NOT NULL DEFAULT 0.3,
@@ -836,7 +839,7 @@
 );
 
 CREATE TABLE user_signal_profiles (
-    user_did TEXT PRIMARY KEY REFERENCES users(did),
+    user_did TEXT PRIMARY KEY,
     total_likes INTEGER NOT NULL DEFAULT 0,
     total_tags INTEGER NOT NULL DEFAULT 0,
     top_categories TEXT,
@@ -861,17 +864,19 @@
 | `/feeds/add` | POST | Add a single feed URL |
 | `/feeds/remove` | DELETE | Remove a feed |
 | `/feeds/refresh` | POST | Refresh all subscribed feeds |
-| `/feeds/clear` | POST | Clear all subscriptions |
-| `/feeds/dismiss` | POST | Dismiss a feed recommendation |
+| `/feeds/retry` | POST | Retry a failed feed |
+| `/feeds/clear` | POST | Clear all subscriptions |
+| `/feeds/dismiss` | POST | Dismiss a feed recommendation |
 | `/articles` | GET | Read articles (paginated, filterable by feed) |
+| `/articles/new-count` | GET | Get count of new articles (for badge updates) |
 | `/articles/{id}` | GET | Article detail view |
 | `/articles/{id}/read` | POST | Mark article as read |
 | `/articles/{id}/unread` | POST | Mark article as unread |
 | `/articles/{id}/like` | POST | Like an article |
 | `/articles/{id}/fetch-content` | POST | Fetch full article content from original URL |
-| `/articles/mark-all-read` | POST | Mark all articles as read |
-| `/articles/dismiss` | POST | Dismiss an article recommendation |
-| `/trending` | GET | Community feed: articles ranked by likes |
+| `/articles/mark-all-read` | POST | Mark all articles as read |
+| `/articles/dismiss` | POST | Dismiss an article recommendation |
+| `/trending` | GET | Community feed: articles ranked by likes |
 | `/library` | GET | Liked articles and annotations |
 | `/library/create` | POST | Create annotation on an article |
 | `/library/{id}/delete` | POST | Delete an annotation |
@@ -926,6 +931,8 @@
 │   │   ├── fetcher.go # Scheduler with dedup + Fetcher
 │   │   ├── discover.go # Feed auto-discovery from URLs
 │   │   └── opml.go # OPML import/export
+│   ├── httpclient/
+│   │   └── httpclient.go # Shared HTTP transport, retry logic, User-Agent
 │   ├── scraper/
 │   │   └── scraper.go # Full article content scraper
 │   ├── metrics/
@@ -965,6 +972,7 @@
 │   ├── trending.html # Trending articles
 │   ├── library.html # Liked articles + annotations
 │   ├── profile.html # User profile
+│   ├── error.html # Error page
 │   ├── 404.html # Not found page
 │   └── partials/ # Reusable template fragments
 ├── static/
@@ -1051,13 +1059,16 @@
 
 Glean exposes a `/metrics` endpoint for monitoring. Key metrics:
 
-- **`glean_feeds_fetched_total`** — Feed fetch attempts labeled by status (`success`, `error`, `not_modified`)
+- **`glean_feeds_fetched_total`** — Total feed fetch attempts (counter)
+- **`glean_feeds_fetched_last_timestamp_seconds`** — Unix timestamp of last feed fetch (gauge)
 - **`glean_feed_fetch_duration_seconds`** — Histogram of feed fetch latency
 - **`glean_articles_upserted_total`** — Counter of articles stored from feeds
 - **`glean_jetstream_events_total`** — Jetstream events labeled by collection and action
 - **`glean_jetstream_errors_total`** — Jetstream handler errors
 - **`glean_jetstream_reconnects_total`** — Jetstream reconnection count
 - **`glean_http_requests_total`** — HTTP request counts labeled by method, path, and status
+- **`glean_http_request_duration_seconds`** — HTTP request duration labeled by method and path
+- **`glean_users_active_total`** — Number of users with active sessions
 - **`glean_pds_sync_runs_total`** / **`glean_pds_sync_errors_total`** — PDS sync runs and errors
 - **`glean_cluster_runs_total`** / **`glean_cluster_duration_seconds`** — Recommendation engine runs and timing
 
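The `docs/specs.md` changes above keep `CHECK(user_a < user_b)` on `user_similarity` and describe the popularity signal as `log(1 + subscribers) / log(1 + max)`. The following Go sketch shows one way those two pieces could be computed. It is an illustration under assumptions, not code from the Glean repository: the helpers `jaccard`, `orderedPair`, `popularity`, and `set`, along with the example DIDs and feed URLs, are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// jaccard returns |A ∩ B| / |A ∪ B| for two sets of feed URLs.
// Two empty sets are treated as having similarity 0.
func jaccard(a, b map[string]struct{}) float64 {
	if len(a) == 0 && len(b) == 0 {
		return 0
	}
	inter := 0
	for k := range a {
		if _, ok := b[k]; ok {
			inter++
		}
	}
	union := len(a) + len(b) - inter
	return float64(inter) / float64(union)
}

// orderedPair returns two DIDs in ascending byte order so a stored row
// can satisfy CHECK(user_a < user_b). Callers are assumed to skip x == y.
func orderedPair(x, y string) (string, string) {
	if y < x {
		return y, x
	}
	return x, y
}

// popularity computes log(1 + subscribers) / log(1 + max), giving a value
// in [0, 1] when 0 <= subscribers <= maxSubscribers.
func popularity(subscribers, maxSubscribers int) float64 {
	if maxSubscribers <= 0 {
		return 0
	}
	return math.Log1p(float64(subscribers)) / math.Log1p(float64(maxSubscribers))
}

// set builds a string set from its arguments (hypothetical helper).
func set(items ...string) map[string]struct{} {
	s := make(map[string]struct{}, len(items))
	for _, it := range items {
		s[it] = struct{}{}
	}
	return s
}

func main() {
	alice := set("https://a.example/feed", "https://b.example/feed", "https://c.example/feed")
	bob := set("https://b.example/feed", "https://c.example/feed", "https://d.example/feed")

	userA, userB := orderedPair("did:plc:bob", "did:plc:alice")
	fmt.Printf("%s %s jaccard=%.2f\n", userA, userB, jaccard(alice, bob))
	fmt.Printf("popularity=%.3f\n", popularity(9, 99))
}
```

Storing each pair once in a canonical order halves the table size and avoids having to keep `(a, b)` and `(b, a)` rows in sync. With foreign keys removed in this diff, checks such as "both DIDs exist in `users`" would have to happen in application code before a row is written.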
+1
readme.md
@@ -46,6 +46,7 @@
 | `GLEAN_CLUSTER_INTERVAL` | `10m` | Cluster recomputation interval (Go duration) |
 | `GLEAN_FETCH_INTERVAL` | `5m` | Feed fetch scheduler tick interval (Go duration) |
 | `GLEAN_COLLECTION_DIR_URL` | _(empty)_ | Collection directory URL for startup backfill |
+| `GLEAN_BACKFILL_CONCURRENCY` | `5` | Max concurrent backfill workers |
 | `GLEAN_PLC_URL` | `https://didplc.glean.at` | PLC directory URL for DID resolution |
 | `GLEAN_OAUTH_CLIENT_ID` | _(empty)_ | OAuth client metadata URL (leave empty for localhost dev) |
 | `GLEAN_OAUTH_REDIRECT_URL` | _(empty)_ | OAuth redirect URL (leave empty for localhost dev) |