···11# Malfestio
2233-Malfestio is a learning OS: flashcards + notes + lectures + articles, designed for daily study.
33+Malfestio is a social learning platform: flashcards + notes + lectures + articles, designed for daily study.
4455Social layer: publish/share/remix learning artifacts; follow curators; discuss.
6677-## Personas
88-99-- **Learner**: studies daily; imports content; wants fast "review queue".
1010-- **Creator**: makes decks/notes; publishes updates; wants feedback + forks.
1111-- **Curator/Teacher**: bundles content into learning paths; annotates lectures/articles.
1212-- **Moderator/Community admin**: handles reports, takedowns, spam.
1313-1414-## Principles
1515-1616-- Local-first study experience; offline study must not feel "second-class".
1717-- Shareable artifacts are portable: Lexicon-defined schemas + stable IDs.
1818-- Privacy by design: progress + recall history are private unless explicitly
1919- shared.
2020-2121-### Data Model
77+## Documentation
2282323-- Note: markdown + structure + citations + links to sources.
2424-- Card: front/back (+ optional cloze, audio, image, code block).
2525-- Deck: ordered/clustered cards (+ metadata, tags).
2626-- Lecture: external URL + outline + timestamps + linked notes/cards.
2727-- Article: URL + extracted text (readability style heuristics) + highlights +
2828- linked notes/cards.
2929-- Collection/Path: curated bundle of decks + notes + sources.
3030-3131-## System Architecture
3232-3333-### Frontend (SolidJS)
3434-3535-- App shell + router-driven workspaces (Library / Study / Create / Social).
3636-- Signals as primary state primitive; keep study session state in signals/store.
3737-3838-### Backend (Rust)
3939-4040-- Axum API gateway: REST/XRPC-ish endpoints, tower middleware, typed extractors.
4141-- Services (logical, not necessarily microservices):
4242- - Identity/Auth service (local + optional ATProto OAuth integration)
4343- - Content service (notes/cards/decks/sources)
4444- - Study service (queue generation + grading + scheduling)
4545- - Social service (follows, feeds, comments, notifications)
4646- - Search service (indexing + query)
4747- - Moderation service (reports, takedowns, rules)
4848-4949-### Storage
5050-5151-- Postgres: canonical app DB (users, private study state, cache of published records).
5252-- Object storage: images/audio, extracted article snapshots (if you store them).
5353-- Search index: separate system (Meilisearch/Typesense/ZincSearch-pick one later).
5454-5555-### Eventing
5656-5757-- Internal outbox pattern (DB table) for:
5858- - reindex jobs, notification fanout, federation publish steps
99+- [Personas & Principles](./docs/personas.md) – Target users and design philosophy
1010+- [Architecture](./docs/architecture.md) – System components and data model
1111+- [Information Architecture](./docs/information-architecture.md) – Navigation and URL structure
1212+- [Data Model Mapping](./docs/data-model-mapping.md) – Lexicon to database mapping
1313+- [Roadmap](./docs/todo.md) – Development milestones
docs/architecture.md (+43)
···11+# System Architecture
22+33+This document describes the technical architecture of Malfestio.
44+55+## Frontend (SolidJS)
66+77+- App shell + router-driven workspaces (Library / Study / Create / Social).
88+- Signals as primary state primitive; keep study session state in signals/store.
99+1010+## Backend (Rust)
1111+1212+- Axum API gateway: REST/XRPC-ish endpoints, tower middleware, typed extractors.
1313+- Services (logical, not necessarily microservices):
1414+ - Identity/Auth service (local + optional ATProto OAuth integration)
1515+ - Content service (notes/cards/decks/sources)
1616+ - Study service (queue generation + grading + scheduling)
1717+ - Social service (follows, feeds, comments, notifications)
1818+ - Search service (indexing + query)
1919+ - Moderation service (reports, takedowns, rules)
2020+2121+## Storage
2222+2323+- Postgres: canonical app DB (users, private study state, cache of published records).
2424+- Object storage: images/audio, extracted article snapshots (if you store them).
2525+- Search index: separate system (Meilisearch/Typesense/ZincSearch; pick one later).
2626+2727+## Eventing
2828+2929+- Internal outbox pattern (DB table) for:
3030+ - reindex jobs, notification fanout, federation publish steps
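The outbox bullet above can be illustrated with a minimal in-memory sketch. It assumes the outbox is a table of pending jobs that a worker drains in insertion order; the `Outbox` class, the `OutboxKind` names, and the boolean handler are hypothetical stand-ins, not the project's actual schema or worker.

```typescript
// In-memory stand-in for an outbox table; every name here is hypothetical.
type OutboxKind = "reindex" | "notification_fanout" | "federation_publish";

type OutboxRow = {
  id: number;
  kind: OutboxKind;
  payload: string;            // serialized job arguments
  processedAt: number | null; // null until a worker succeeds
};

class Outbox {
  private rows: OutboxRow[] = [];
  private nextId = 1;

  // In a real system this insert would share a DB transaction with the
  // state change that caused it, so the two commit or roll back together.
  enqueue(kind: OutboxKind, payload: string): OutboxRow {
    const row: OutboxRow = { id: this.nextId++, kind, payload, processedAt: null };
    this.rows.push(row);
    return row;
  }

  // Runs the handler over pending rows in insertion order. Rows whose
  // handler returns false stay pending and are retried on the next drain.
  drain(handler: (row: OutboxRow) => boolean, now: number): number {
    let done = 0;
    for (const row of this.rows) {
      if (row.processedAt !== null) continue;
      if (handler(row)) {
        row.processedAt = now;
        done++;
      }
    }
    return done;
  }

  pending(): number {
    return this.rows.filter((r) => r.processedAt === null).length;
  }
}
```

Because a failed row is retried, delivery in this sketch is at-least-once, so handlers would need to be idempotent.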
3131+3232+## Data Model
3333+3434+See [Data Model Mapping](./data-model-mapping.md) for the mapping between public Lexicon records and internal database tables.
3535+3636+### Core Entities
3737+3838+- **Note**: markdown + structure + citations + links to sources.
3939+- **Card**: front/back (+ optional cloze, audio, image, code block).
4040+- **Deck**: ordered/clustered cards (+ metadata, tags).
4141+- **Lecture**: external URL + outline + timestamps + linked notes/cards.
4242+- **Article**: URL + extracted text (readability style heuristics) + highlights + linked notes/cards.
4343+- **Collection/Path**: curated bundle of decks + notes + sources.
docs/at-notes.md (+55 -11)
···11# AT Protocol Research Notes
2233+Reference material for AT Protocol integration. For implementation details, see [todo.md](todo.md).
44+35## OAuth 2.1 Specification
4657AT Protocol uses a specific profile of OAuth 2.1 for client↔PDS authorization.
···44464547Example: `at://did:plc:abc123/app.malfestio.deck/3k5abc123`
46484747-## Firehose Consumption
4949+## Firehose / Jetstream
48504949-For social features (trending, discovery, feeds):
5151+### Raw Firehose
50525151-- **WebSocket Connection**: Subscribe to `com.atproto.sync.subscribeRepos` from a Relay
5252-- **CBOR Decoding**: Parse incoming events (or use Jetstream for JSON)
5353+- **WebSocket**: Subscribe to `com.atproto.sync.subscribeRepos` from a Relay
5454+- **CBOR Decoding**: Parse incoming events
5355- **Cursor Management**: Track position for reconnection
54565555-## AppView Pattern
5757+### Jetstream (Recommended)
56585757-Index network-wide records to power discovery features:
5959+Bluesky's simplified JSON firehose:
58605959-- Index `app.malfestio.*` records from firehose
6060-- Implement `getFeedSkeleton` for custom algorithmic feeds
6161-- Hydration service combines skeletons with full content from PDSes
6161+- JSON format (no CBOR decoding)
6262+- Reduced bandwidth (zstd compression)
6363+- Collection/repo filtering at source
6464+- Simpler reconnection with cursors
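As a rough illustration of source-side filtering and cursor-based reconnection, the sketch below builds a Jetstream subscription URL. The `/subscribe` path and the `wantedCollections` and `cursor` query parameters reflect my understanding of Jetstream and should be checked against its documentation; `buildJetstreamUrl`, `JetstreamOptions`, and the example host are hypothetical.

```typescript
// Hypothetical helper: builds a Jetstream subscription URL.
// Path and parameter names are assumptions to verify against the Jetstream docs.
type JetstreamOptions = {
  host: string;                 // a public or self-hosted Jetstream instance
  wantedCollections?: string[]; // NSIDs to filter on at the source
  cursor?: number;              // time of the last seen event (unix microseconds)
};

function buildJetstreamUrl(opts: JetstreamOptions): string {
  const params: string[] = [];
  for (const nsid of opts.wantedCollections ?? []) {
    params.push(`wantedCollections=${encodeURIComponent(nsid)}`);
  }
  if (opts.cursor !== undefined) {
    // On reconnect, pass the last processed event time to resume from there.
    params.push(`cursor=${opts.cursor}`);
  }
  const query = params.length > 0 ? `?${params.join("&")}` : "";
  return `wss://${opts.host}/subscribe${query}`;
}
```

A consumer would persist the time of the last processed event and pass it back as `cursor` after a disconnect.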
62656366## Well-Known Endpoints
6467···6669- `/.well-known/oauth-protected-resource` — PDS OAuth metadata
6770- `/.well-known/oauth-authorization-server` — Auth server metadata
68717272+## Labelers
7373+7474+**Architecture:**
7575+7676+1. Labels = metadata (source DID + subject AT-URI + value string)
7777+2. User Subscription = users subscribe to labelers; clients include in API requests
7878+3. Label Interpretation = per-user config to hide, warn, or ignore content
7979+8080+**Structure:**
8181+8282+```json
8383+{
8484+ "src": "did:plc:labeler",
8585+ "uri": "at://did:user/app.bsky.feed.post/123",
8686+ "val": "spam",
8787+ "cts": "2026-01-01T00:00:00Z"
8888+}
8989+```
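To make the three-part model concrete (labels, subscription, interpretation), here is a small sketch of how a client might turn labels into a display decision. The `Label` shape follows the JSON above; `LabelPrefs`, `decideAction`, and the "most severe action wins" rule are assumptions for illustration, not behavior defined by the protocol.

```typescript
type Label = { src: string; uri: string; val: string; cts: string };
type Action = "hide" | "warn" | "ignore";

// Hypothetical per-user config: labeler DID -> (label value -> action).
// A labeler missing from this map is treated as not subscribed.
type LabelPrefs = Record<string, Record<string, Action>>;

const severity: Record<Action, number> = { ignore: 0, warn: 1, hide: 2 };

// Returns the most severe action requested by any subscribed labeler
// for the given subject URI (an assumed tie-breaking rule).
function decideAction(subjectUri: string, labels: Label[], prefs: LabelPrefs): Action {
  let result: Action = "ignore";
  for (const label of labels) {
    if (label.uri !== subjectUri) continue;
    const labelerPrefs = prefs[label.src];
    if (labelerPrefs === undefined) continue; // user is not subscribed
    const action: Action = labelerPrefs[label.val] ?? "ignore";
    if (severity[action] > severity[result]) result = action;
  }
  return result;
}
```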
9090+9191+## Feeds
9292+9393+**Core Flow**:
9494+9595+1. User requests feed via at-uri of declared feed
9696+2. PDS resolves at-uri → Feed Generator's DID doc
9797+3. PDS sends `getFeedSkeleton` to service endpoint (authenticated by user's JWT)
9898+4. Feed Generator returns skeleton (list of post URIs + cursor)
9999+5. PDS hydrates skeleton with full content (via AppView)
100100+6. Hydrated feed returned to user
101101+102102+## AppView
103103+104104+**Responsibilities**:
105105+106106+1. Record Processing & Indexing - consume firehose, build indices for likes, threads, follows
107107+2. Moderation Enforcement - apply labels from subscribed labelers
108108+3. Query Interface - expose XRPC API (proxied through PDS)
109109+4. Media CDN - fetch/cache blobs from upstream PDSes, generate thumbnails
110110+5. Search & Discovery - full-text search, type-ahead, content ranking
111111+69112## Patterns from Real AT Protocol Apps
7011371114### plyr.fm (Music)
···73116- OAuth 2.1 via `@atproto/oauth-client` library
74117- Records synced to PDS: tracks, likes, playlists
75118- Separate moderation service (Rust labeler)
7676-- Data ownership: "tracks, likes, playlists synced to your PDS as ATProto records"
7711978120### leaflet.pub (Writing)
7912180122- React/Next.js frontend with Supabase + Replicache for sync
81123- Bluesky integration via dedicated `lexicons/` and `appview/` directories
8282-- Publications posted to Bluesky
8312484125### wisp.place (Static Sites)
85126···101142- [Repository & XRPC](https://atproto.com/specs/xrpc)
102143- [Feed Generator Starter Kit](https://github.com/bluesky-social/feed-generator)
103144- [atproto TypeScript SDK](https://github.com/bluesky-social/atproto)
145145+- [Ozone Moderation Service](https://github.com/bluesky-social/ozone)
146146+- [Jetstream Firehose](https://docs.bsky.app/blog/jetstream)
147147+- [Labels and Moderation Guide](https://docs.bsky.app/docs/advanced-guides/moderation)
docs/personas.md (+14)
···11+# Personas & Principles
22+33+## Personas
44+55+- **Learner**: studies daily; imports content; wants fast "review queue".
66+- **Creator**: makes decks/notes; publishes updates; wants feedback + forks.
77+- **Curator/Teacher**: bundles content into learning paths; annotates lectures/articles.
88+- **Moderator/Community admin**: handles reports, takedowns, spam.
99+1010+## Design Principles
1111+1212+- Local-first study experience; offline study must not feel "second-class".
1313+- Shareable artifacts are portable: Lexicon-defined schemas + stable IDs.
1414+- Privacy by design: progress + recall history are private unless explicitly shared.
docs/todo.md (+224 -33)
···21212222- Lexicon defines record types + XRPC endpoints; JSON-schema-like constraints.
2323- Use "optional fields" heavily; avoid enums that will calcify the product too early.
2424-- Versioning: add fields, don’t rename; never rely on being able to rewrite history.
2424+- Versioning: add fields, don't rename; never rely on being able to rewrite history.
25252626### Schema boundaries (important)
2727···4545- **(Done) Milestone D**: Identity + Permissions + Publishing Model.
4646 - Auth MVP, Permission model (Private/Public/SharedWith), and basic Publishing flow implemented.
4747 - Backend API and Frontend Editor updated with tests covering permissions and publishing.
4848+- **(Done) Milestone E**: Internal component library/UI Foundation + Animations.
4849- **(Done) Milestone F**: OAuth + PDS Record Publishing.
4950 - OAuth 2.1 client flow (PKCE, DPoP, handle/DID resolution, token refresh).
5051 - PDS client for `putRecord`, `deleteRecord`, `uploadBlob`.
5152 - TID generation and AT-URI builder in core crate.
5252-- **(Done) Milestone E**: Internal component library/UI Foundation + Animations.
5353-- **(Done) Milestone F**: Content Authoring (Notes + Cards + Deck Builder).
5454-- **(Done) Milestone G**: Study Engine (SRS) + Daily Review UX.
5353+- **(Done) Milestone G**: Content Authoring (Notes + Cards + Deck Builder).
5454+- **(Done) Milestone H**: Study Engine (SRS) + Daily Review UX.
5555 - SM-2 spaced repetition scheduler.
5656-- **(Done) Milestone H**: Social Layer v1: Follow graph, Feeds (Follows/Trending), Forking workflow, and Threaded comments.
5757-- **(Done) Milestone I**: Search + Discovery + Taxonomy.
5656+- **(Done) Milestone I**: Social Layer v1: Follow graph, Feeds (Follows/Trending), Forking workflow, and Threaded comments.
5857 - Full-text search with pg_trgm/unaccent, visibility filtering, and unified search index.
5958 - Tag taxonomy and Discovery page with top tags.
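The SM-2 scheduler mentioned under the Study Engine milestone can be sketched as below. This follows the commonly published SM-2 description (quality grades 0 to 5, ease factor floor of 1.3, intervals of 1 and 6 days for the first two successful reviews). `CardState` and `sm2` are hypothetical names, and using the updated ease factor for the interval is one of several variants, so this is not necessarily what the project's scheduler does.

```typescript
type CardState = {
  repetitions: number;  // consecutive successful reviews
  intervalDays: number; // days until the next review
  easeFactor: number;   // SM-2 E-Factor, starts at 2.5
};

// One SM-2 review step. `quality` is an integer grade from 0 (blackout) to 5 (perfect).
function sm2(state: CardState, quality: number): CardState {
  if (quality < 0 || quality > 5 || Math.floor(quality) !== quality) {
    throw new RangeError("quality must be an integer from 0 to 5");
  }
  if (quality < 3) {
    // Failed recall: restart the repetition sequence, keep the ease factor.
    return { repetitions: 0, intervalDays: 1, easeFactor: state.easeFactor };
  }
  const delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02);
  const easeFactor = Math.max(1.3, state.easeFactor + delta);
  const repetitions = state.repetitions + 1;
  let intervalDays: number;
  if (repetitions === 1) {
    intervalDays = 1;
  } else if (repetitions === 2) {
    intervalDays = 6;
  } else {
    // Assumption: the interval grows by the updated ease factor.
    intervalDays = Math.round(state.intervalDays * easeFactor);
  }
  return { repetitions, intervalDays, easeFactor };
}
```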
60596161-### Milestone J - Moderation + Abuse Resistance
6060+### Milestone J - Static Content & Landing Page
6161+6262+#### Deliverables
6363+6464+**Marketing/Static Site:**
6565+6666+- [ ] Landing page with hero section, feature highlights, social proof
6767+ - Hero should have a graph paper/grid background
6868+ - Floating flash cards, notes
6969+- [ ] "How it works" section with study flow visualization
7070+- [ ] About page with team/mission
7171+7272+**App Vision Content:**
7373+7474+- [ ] Onboarding flow with persona selection (Learner/Creator/Curator)
7575+- [ ] Empty states with helpful prompts for new users
7676+- [ ] Tutorial/walkthrough for first deck creation
7777+- [ ] Help center or FAQ section; it should mention that the app is still in development and subject to change.
7878+7979+**SEO & Meta:**
8080+8181+- [ ] Open Graph / Twitter Card meta tags
8282+ - [ ] Scripted with HTML2Canvas/PNG generation
8383+ - Graph paper background
8484+ - Floating flash cards, notes
8585+- [ ] Sitemap.xml generation
8686+- [ ] robots.txt configuration
8787+8888+#### Acceptance
8989+9090+- New visitors understand the value proposition within 10 seconds.
9191+- Onboarding flow guides users to create their first deck.
9292+9393+### Milestone K - AppView Indexing
9494+9595+#### Deliverables
9696+9797+**Firehose Enhancement:**
9898+9999+- [ ] Upgrade firehose consumer to store full record content (not just metadata)
100100+- [ ] Add `indexed_decks`, `indexed_cards`, `indexed_notes` tables for remote content
101101+- [ ] Track latest processed revision per repo; handle deletions
102102+103103+**Search & Discovery:**
104104+105105+- [ ] Implement search over indexed remote records (extend `search_items` view)
106106+- [ ] User profile aggregation: follower counts, deck counts from federated sources
107107+108108+**Export & Interop:**
109109+110110+- [ ] Export local records as valid Lexicon JSON (`/api/export/:collection`)
111111+- [ ] Read-only "federated library" view showing remote decks
112112+113113+#### Acceptance
114114+115115+- A deck published from Malfestio can be discovered via another AT Protocol client.
116116+- Remote decks from followed users appear in search results.
117117+118118+### Milestone L - ATProto Integration Pass
6211963120#### Deliverables
641216565-- Look into [Ozone](https://github.com/bluesky-social/ozone)
6666-- Reporting pipeline + review queue
6767-- Rate limits + spam heuristics
6868-- Takedown/visibility states (shadowed, removed, quarantined)
6969-- Audit logging for moderation actions
122122+**Identity & Auth:**
123123+124124+- [ ] OAuth login directly to user's PDS (vs. local-only auth)
125125+- [ ] Handle resolution via DNS TXT or `/.well-known/atproto-did`
126126+- [ ] DPoP token binding for secure API calls
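For the handle-resolution item, the sketch below shows the two lookup targets and how a DID might be extracted from DNS TXT values. It assumes the AT Protocol conventions of a `_atproto.<handle>` TXT record containing `did=<DID>` and a plain-text DID at `/.well-known/atproto-did`; the function names are hypothetical, the DID check is deliberately loose, and the actual DNS and HTTPS requests are left out.

```typescript
// DNS name to query for TXT records when resolving a handle.
function atprotoTxtName(handle: string): string {
  return `_atproto.${handle}`;
}

// HTTPS fallback location; the response body is expected to be the DID as plain text.
function wellKnownDidUrl(handle: string): string {
  return `https://${handle}/.well-known/atproto-did`;
}

// Loose syntactic check, not a full DID validator.
function looksLikeDid(value: string): boolean {
  return /^did:[a-z]+:[a-zA-Z0-9._:%-]+$/.test(value);
}

// Extracts the DID from TXT record values such as "did=did:plc:abc123".
// Returns null when no usable record exists or when the result is ambiguous.
function didFromTxtRecords(records: string[]): string | null {
  const dids: string[] = [];
  for (const record of records) {
    const value = record.trim();
    if (value.slice(0, 4) !== "did=") continue;
    const did = value.slice(4);
    if (looksLikeDid(did) && dids.indexOf(did) === -1) dids.push(did);
  }
  const first = dids[0];
  return dids.length === 1 && first !== undefined ? first : null;
}
```

Whichever method returns a DID, a resolver would normally also confirm that the DID document claims the same handle before trusting the result.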
127127+128128+**Sync & Conflict Resolution:**
129129+130130+- [ ] Bi-directional sync: local drafts → PDS records, PDS records → local cache
131131+- [ ] Conflict resolution strategy for concurrent edits (last-write-wins or merge UI)
132132+- [ ] Offline queue for pending publishes
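Of the two conflict strategies named above, last-write-wins is the simpler one to sketch. The example assumes each copy carries an `updatedAt` timestamp and that ties go to the remote (published) copy; `Versioned`, `SyncResult`, and `lastWriteWins` are hypothetical names, and the tie rule is an arbitrary choice.

```typescript
type Versioned<T> = {
  value: T;
  updatedAt: number; // milliseconds since epoch, as recorded by the writer
};

type SyncResult<T> = { winner: "local" | "remote"; record: Versioned<T> };

// Picks the most recently written copy. Ties go to the remote copy (assumption).
function lastWriteWins<T>(local: Versioned<T>, remote: Versioned<T>): SyncResult<T> {
  if (local.updatedAt > remote.updatedAt) {
    return { winner: "local", record: local };
  }
  return { winner: "remote", record: remote };
}
```

Last-write-wins silently discards the older edit and depends on reasonably synchronized clocks, which is the usual argument for offering a merge UI instead.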
133133+134134+**Deep Linking:**
135135+136136+- [ ] AT-URI deep linking from external clients
137137+- [ ] Handle `at://` URL scheme in app
7013871139#### Acceptance
721407373-- You can safely operate an open publishing surface.
141141+- User can log in with their existing Bluesky/PDS identity.
142142+- Local drafts sync correctly after reconnecting.
741437575-### Milestone K - Federation / ATProto Integration Pass
144144+#### Implementation Details
145145+146146+**Considerations:**
148148+- Scalability: indexing needs substantial compute; plan for caching, DB optimization, and distributed processing
149149+- Lexicon Validation: validate schemas, ignore invalid records gracefully
150150+- Account State: track latest processed revision per repo; handle deletions
151151+- Bluesky's AppView uses PostgreSQL or ScyllaDB + image proxy + AppView core
152152+153153+**Identity:**
154154+155155+- Use `did:web` for simplicity, `did:plc` for long-term stability
156156+- ATProto OAuth is the forward path
157157+158158+### Milestone M - Custom Feed Generator
7615977160#### Deliverables
781617979-- Phase 1 (minimum):
8080- - export Lexicon records
8181- - ingest remote records into a read-only "federated library"
8282-- Phase 2:
8383- - OAuth login to PDS + publish records directly (client or server mediated)
8484- - reconcile local drafts with remote published state
162162+**Infrastructure:**
163163+164164+- [ ] Feed Generator service with `did:web` identity
165165+- [ ] Publish `app.bsky.feed.generator` declaration record to creator's repo
166166+- [ ] DID document with service endpoint for feed requests
167167+168168+**Endpoints:**
169169+170170+- [ ] `app.bsky.feed.getFeedSkeleton` - Return post URIs + cursor for pagination
171171+- [ ] `app.bsky.feed.describeFeedGenerator` - Feed metadata (DID, name, description)
172172+- [ ] JWT authentication for user-personalized feeds
173173+174174+**Algorithms:**
175175+176176+- [ ] "Trending Decks" - Top decks by fork/like count in last 7 days
177177+- [ ] "New from Following" - Latest decks from followed creators
178178+- [ ] "Study Streak Leaders" - Decks with highest completion rates (anonymized)
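The "Trending Decks" algorithm could look roughly like the sketch below. It assumes forks and likes count equally and that ties are broken by deck URI for stable output; the `Engagement` shape and `trendingDecks` function are hypothetical.

```typescript
type Engagement = {
  deckUri: string;
  kind: "fork" | "like";
  at: number; // event time, milliseconds since epoch
};

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Ranks decks by fork + like count within the 7 days before `now`.
function trendingDecks(events: Engagement[], now: number, limit: number): string[] {
  const windowStart = now - SEVEN_DAYS_MS;
  const counts: Record<string, number> = {};
  for (const event of events) {
    if (event.at < windowStart || event.at > now) continue;
    counts[event.deckUri] = (counts[event.deckUri] ?? 0) + 1;
  }
  const uris = Object.keys(counts);
  uris.sort((a, b) => {
    const byCount = (counts[b] ?? 0) - (counts[a] ?? 0);
    if (byCount !== 0) return byCount;
    return a < b ? -1 : a > b ? 1 : 0; // deterministic tie-break
  });
  return uris.slice(0, limit);
}
```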
179179+180180+**Indexing:**
181181+182182+- [ ] Subscribe to `com.atproto.sync.subscribeRepos` (or Jetstream) for `app.malfestio.*` records
183183+- [ ] Index posts with compound cursor (timestamp::CID) for deterministic pagination
184184+- [ ] Garbage collect indexed data older than 48 hours (except pinned content)
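The 48-hour garbage-collection rule can be sketched as a simple filter. `IndexedItem`, its `pinned` flag, and `garbageCollect` are hypothetical; a real implementation would more likely be a scheduled `DELETE` against the index tables.

```typescript
type IndexedItem = {
  uri: string;
  indexedAt: number; // milliseconds since epoch
  pinned: boolean;   // pinned content is exempt from collection
};

const MAX_AGE_MS = 48 * 60 * 60 * 1000;

// Returns the items to keep: everything pinned, plus anything indexed
// within the last 48 hours relative to `now`.
function garbageCollect(items: IndexedItem[], now: number): IndexedItem[] {
  return items.filter((item) => item.pinned || now - item.indexedAt <= MAX_AGE_MS);
}
```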
8518586186#### Acceptance
871878888-- A published artifact is portable beyond your app.
188188+- Custom feed appears in Bluesky and other AT Protocol clients.
189189+- Feed surfaces relevant learning content based on engagement signals.
190190+- Pagination works correctly across feed refreshes.
891919090-#### Notes
192192+#### Implementation Details
911939292-- ATProto OAuth is the forward path; plan on it.
9393-- XRPC endpoint patterns and legacy session behavior exist, but treat them as transitional.
194194+**Core Flow:** see [AT Notes](./at-notes.md#feeds)
941959595-### Milestone L - Reliability, Observability, Launch
196196+**Skeleton Response Format:**
197197+198198+```json
199199+{
200200+ "feed": [
201201+ {"post": "at://did:example/app.bsky.feed.post/1"},
202202+ {"post": "at://did:example/app.bsky.feed.post/2"}
203203+ ],
204204+ "cursor": "1683654690921::bafyrei..."
205205+}
206206+```
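The `timestamp::CID` cursor in the response above can be produced and consumed as in the sketch below. It assumes rows are ordered newest first with the CID as a tie-breaker, so items sharing a timestamp are neither skipped nor repeated between pages. `FeedRow`, `getPage`, and the in-memory sort are hypothetical; a real generator would push the same ordering and filter into its database query.

```typescript
type FeedRow = { uri: string; cid: string; indexedAt: number };
type Skeleton = { feed: { post: string }[]; cursor?: string };

function encodeCursor(row: FeedRow): string {
  return `${row.indexedAt}::${row.cid}`;
}

function decodeCursor(cursor: string): { timestamp: number; cid: string } {
  const sep = cursor.indexOf("::");
  if (sep <= 0) throw new Error("malformed cursor");
  const timestamp = Number(cursor.slice(0, sep));
  const cid = cursor.slice(sep + 2);
  if (!isFinite(timestamp) || cid === "") throw new Error("malformed cursor");
  return { timestamp, cid };
}

// Returns one page of the feed, ordered by (indexedAt desc, cid desc).
function getPage(rows: FeedRow[], limit: number, cursor?: string): Skeleton {
  const sorted = rows.slice().sort((a, b) => {
    if (a.indexedAt !== b.indexedAt) return b.indexedAt - a.indexedAt;
    return a.cid < b.cid ? 1 : a.cid > b.cid ? -1 : 0;
  });
  let remaining = sorted;
  if (cursor !== undefined) {
    const c = decodeCursor(cursor);
    // Keep only rows strictly after the cursor position in sort order.
    remaining = sorted.filter(
      (r) => r.indexedAt < c.timestamp || (r.indexedAt === c.timestamp && r.cid < c.cid)
    );
  }
  const page = remaining.slice(0, limit);
  const result: Skeleton = { feed: page.map((r) => ({ post: r.uri })) };
  const last = page[page.length - 1];
  if (last !== undefined) result.cursor = encodeCursor(last);
  return result;
}
```

Omitting `cursor` from the result when the page is empty signals the end of the feed to the caller.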
207207+208208+**Skeleton Metadata Types:**
209209+210210+```typescript
211211+type SkeletonItem = {
212212+ post: string // post URI
213213+ reason?: ReasonRepost // optional context (e.g., repost)
214214+}
215215+type ReasonRepost = {
216216+ $type: 'app.bsky.feed.defs#skeletonReasonRepost'
217217+ repost: string // repost URI
218218+}
219219+```
220220+221221+**Considerations:**
222222+223223+- Validate user JWTs if feed depends on user state (follows, likes)
224224+- Use compound cursor (timestamp::CID) for deterministic pagination
225225+- Most feeds can garbage collect data older than 48 hours
226226+- Reference: [Feed Generator Starter Kit](https://github.com/bluesky-social/feed-generator)
227227+228228+### Milestone N - Reliability, Observability, Launch
9622997230#### Deliverables
982319999-- Metrics + tracing + structured logs
100100-- Backups + restore drills
101101-- Load test targets (study session + feed + search)
102102-- Beta program + feedback loop + roadmap iteration
232232+**Observability:**
103233104104-## Open Questions (Parked Decisions)
234234+- [ ] Structured logging with correlation IDs
235235+- [ ] Metrics collection (Prometheus/OpenTelemetry)
236236+- [ ] Distributed tracing for request flows
237237+- [ ] Error tracking (Sentry or similar)
105238106106-- Local-first mechanics: full offline authoring + later publish, or online-only creation?
107107-- Federation depth: read-only ingest first, or publish-to-PDS in the first public beta?
108108-- Content extraction: store extracted article snapshots (legal/ops implications), or store only metadata + highlights?
239239+**Reliability:**
240240+241241+- [ ] Database backups + restore drills
242242+- [ ] Health check endpoints (`/health`, `/ready`)
243243+- [ ] Graceful shutdown handling
244244+- [ ] Circuit breakers for external dependencies
245245+246246+**Load Testing:**
247247+248248+- [ ] Study session throughput targets
249249+- [ ] Feed generation latency benchmarks
250250+- [ ] Search query performance under load
251251+252252+**Launch Prep:**
253253+254254+- [ ] Beta program signup flow
255255+- [ ] Feedback collection mechanism
256256+- [ ] Feature flags for gradual rollout
257257+258258+#### Acceptance
259259+260260+- System handles 10x expected load without degradation.
261261+- Mean time to recovery < 5 minutes for common failures.
262262+263263+### Milestone O - Moderation + Abuse Resistance
264264+265265+#### Deliverables
266266+267267+**Labeler Infrastructure:**
268268+269269+- [ ] Dedicated Bluesky account for labeler service
270270+- [ ] Publish `app.bsky.labeler.service` record to make discoverable
271271+- [ ] Self-host Ozone backend + UI (Docker setup in `HOSTING.md`)
272272+- [ ] Configure report types via `goat` CLI
273273+274274+**Endpoints:**
275275+276276+- [ ] `com.atproto.label.subscribeLabels` - real-time label stream
277277+- [ ] `com.atproto.label.queryLabels` - query published labels
278278+- [ ] `com.atproto.report.createReport` - accept user reports
279279+280280+**Moderation Features:**
281281+282282+- [ ] Reporting pipeline + review queue UI
283283+- [ ] Rate limits + spam heuristics
284284+- [ ] Takedown/visibility states (shadowed, removed, quarantined)
285285+- [ ] Audit logging for moderation actions
286286+287287+#### Acceptance
288288+289289+- You can safely operate an open publishing surface.
290290+- Users can subscribe to your labeler and see moderation applied.
291291+292292+**Reference:** [Ozone Moderation Service](https://github.com/bluesky-social/ozone)
293293+294294+## Open Questions / Parked Decisions
295295+296296+- Full offline authoring + later publish
297297+- Federation depth: publish-to-PDS in the first public beta
298298+- Content extraction: store extracted article snapshots locally (browser)
299299+ - Persist only metadata + highlights