# ATProto Feed Generator

🚧 Work in Progress 🚧

We are actively developing Feed Generator integration into the Bluesky PDS. Though we are reasonably confident about the general shape and interfaces laid out here, these interfaces and implementation details _are_ subject to change.

We've put together a starter kit for devs. It doesn't do everything, but it should be enough to get you familiar with the system & started building!

## Overview

Feed Generators are services that provide custom algorithms to users through the AT protocol.

They work very simply: the server receives a request from a user's server and returns a list of [post URIs](https://atproto.com/specs/at-uri-scheme) with some optional metadata attached. Those posts are then "hydrated" into full objects by the requesting server and sent back to the client. This route is described in the `com.atproto.feed.getFeedSkeleton` lexicon. (@TODO insert link)

Think of Feed Generators like a user with an API attached. Like atproto users, a Feed Generator is identified by a DID/handle and uses a data repository which holds information like its profile. However, a Feed Generator's DID Document also declares a `#bsky_fg` service endpoint that fulfills the interface for a Feed Generator.

The general flow of providing a custom algorithm to a user is as follows:
- A user requests a feed from their server (PDS). Let's say the feed is identified by `@custom-algo.xyz`
- The PDS resolves `@custom-algo.xyz` to its corresponding DID document
- The PDS sends a `getFeedSkeleton` request to the service endpoint with ID `#bsky_fg`
  - This request is authenticated by a JWT signed by the user's repo signing key
- The Feed Generator returns a skeleton of the feed to the user's PDS
- The PDS hydrates the feed (user info, post contents, aggregates, etc)
  - In the future, the PDS will hydrate the feed with the help of an App View, but for now the PDS handles hydration itself
- The PDS returns the hydrated feed to the user
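
The endpoint lookup in the flow above can be sketched as follows. This is a minimal illustration, not the PDS implementation: it assumes the DID document exposes the standard `service` array, and `findFeedGenEndpoint` and the type names are hypothetical helpers introduced here.

```ts
// Minimal DID document shape (assumption: only the fields this sketch needs).
type DidService = { id: string; type: string; serviceEndpoint: string }
type DidDocument = { id: string; service?: DidService[] }

// Hypothetical helper: returns the URL declared for the `#bsky_fg` service,
// or undefined when the document does not declare one.
// Service IDs may be written relative (`#bsky_fg`) or absolute (`<did>#bsky_fg`).
function findFeedGenEndpoint(doc: DidDocument): string | undefined {
  const services = doc.service ?? []
  const match = services.find(
    (s) => s.id === '#bsky_fg' || s.id === `${doc.id}#bsky_fg`,
  )
  return match ? match.serviceEndpoint : undefined
}
```

A PDS-side caller would send the `getFeedSkeleton` request to the returned URL, and treat `undefined` as "this DID is not a Feed Generator".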

To the user this should feel like visiting a page in the app. Once they subscribe to a feed, it will appear in their home interface as one of their available feeds.

## Getting Started

For now, your algorithm will need to have an account & repository on the `bsky.social` PDS.

First, edit the provided `setup.json` to include your preferred handle, password & invite code, along with the hostname that you will be running this server at. Then run `yarn setup`.

If you need an invite code, please reach out to a Bluesky team member & inform them that you are building a Feed Generator.

Note: _do not_ use the handle/password from your personal Bluesky account. This is a _new account_ for the Feed Generator.

We've set up this simple server with sqlite to store & query data. Feel free to switch this out for whichever database you prefer.

Next you will need to do two things:

- Implement indexing logic in `src/subscription.ts`.

This will subscribe to the repo subscription stream on startup, parse events & index them according to your provided logic.

- Implement feed generation logic in `src/feed-generation.ts`.

The types are in place and you will just need to return something that satisfies the `SkeletonFeedPost[]` type.

For inspiration, we've provided a very simple feed algorithm that returns recent posts from the Bluesky team.

## Some Details

### Skeleton Metadata

The skeleton that a Feed Generator puts together is, in its simplest form, a list of post URIs.

```ts
[
  {post: 'at://did:example:1234/app.bsky.feed.post/1'},
  {post: 'at://did:example:1234/app.bsky.feed.post/2'},
  {post: 'at://did:example:1234/app.bsky.feed.post/3'}
]
```

However, we include two locations to attach some additional context. Here is the full schema:

```ts
type SkeletonItem = {
  post: string // post URI

  // optional metadata about the thread that this post is in reply to
  replyTo?: {
    root: string, // reply root URI
    parent: string, // reply parent URI
  }

  // optional reason for inclusion in the feed
  // (generally to be displayed in client)
  reason?: Reason
}

// for now, the only defined reason is a repost, but this is open to extension
type Reason = ReasonRepost

type ReasonRepost = {
  $type: @TODO
  by: string // the DID of the reposting user
  indexedAt: string // the time that the repost took place
}
```

This metadata serves two purposes:

1. To aid the PDS in hydrating all relevant post information
2. To give a cue to the client in terms of context to display when rendering a post
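
As an illustration of attaching reply context, here is a minimal sketch that maps locally indexed rows to skeleton items. The `IndexedPost` row shape and the `toSkeleton` helper are hypothetical, and `reason` is left out because its `$type` is not yet settled above.

```ts
// Subset of the skeleton schema used by this sketch (no `reason`).
type SkeletonItem = {
  post: string
  replyTo?: { root: string; parent: string }
}

// Hypothetical shape of a post row in your own index.
type IndexedPost = { uri: string; replyRoot?: string; replyParent?: string }

// Build the skeleton, attaching `replyTo` only when both URIs are known.
function toSkeleton(rows: IndexedPost[]): SkeletonItem[] {
  return rows.map((row) => {
    const item: SkeletonItem = { post: row.uri }
    if (row.replyRoot && row.replyParent) {
      item.replyTo = { root: row.replyRoot, parent: row.replyParent }
    }
    return item
  })
}
```

Storing the reply root and parent at indexing time keeps feed generation a plain lookup, since both values are already present on the reply record when it arrives.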

### Authentication

If you are creating a generic feed that does not differ for different users, you do not need to check auth. But if a user's state (such as follows or likes) is taken into account, we _strongly_ encourage you to validate their auth token.

Users are authenticated with a simple JWT signed by the user's repo signing key.

This JWT header/payload takes the format:
```ts
const header = {
  typ: "JWT",
  alg: "ES256K" // (key algorithm) - in this case secp256k1
}
const payload = {
  iss: "did:example:alice", // (issuer) the requesting user's DID
  aud: "did:example:feedGenerator", // (audience) the DID of the Feed Generator
  exp: 1683643619 // (expiration) unix timestamp in seconds
}
```

We provide utilities for verifying user JWTs in `@TODO_PACKAGE`.
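
Independent of signature verification, a Feed Generator should also check the claims themselves. The sketch below covers only that step, under the assumption that the payload has already been decoded and its signature verified elsewhere; `checkClaims` and `FeedJwtPayload` are hypothetical names, not part of any published package.

```ts
// Payload fields as described above.
type FeedJwtPayload = { iss: string; aud: string; exp: number }

// Hypothetical helper: validates audience and expiry, then returns the
// requesting user's DID. It does NOT verify the signature.
function checkClaims(
  payload: FeedJwtPayload,
  feedGenDid: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): string {
  if (payload.aud !== feedGenDid) {
    throw new Error('JWT audience does not match this Feed Generator')
  }
  if (payload.exp <= nowSeconds) {
    throw new Error('JWT has expired')
  }
  return payload.iss
}
```

Checking `aud` matters because a token minted for one Feed Generator could otherwise be replayed against another.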

### Pagination
You'll notice that the `getFeedSkeleton` method returns a `cursor` in its response & takes a `cursor` param as input.

This cursor is treated as an opaque value & is fully at the Feed Generator's discretion. It is simply passed through the PDS directly to & from the client.

We strongly encourage that the cursor be _unique per feed item_ to prevent unexpected behavior in pagination.

We recommend, for instance, a compound cursor with a timestamp + a CID:
`1683654690921::bafyreia3tbsfxe3cc75xrxyyn6qc42oupi73fxiox76prlyi5bpx7hr72u`
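
A compound cursor of this form could be packed and unpacked as sketched below. The `::` separator follows the example above; the `Cursor` type and the helper names are hypothetical.

```ts
type Cursor = { timestamp: number; cid: string }

// Serialize as `<timestamp>::<cid>`.
function packCursor(cursor: Cursor): string {
  return `${cursor.timestamp}::${cursor.cid}`
}

// Parse a cursor received from the client. Since clients can send anything,
// reject input that does not have exactly two well-formed parts.
function unpackCursor(raw: string): Cursor {
  const parts = raw.split('::')
  if (parts.length !== 2 || parts[0] === '' || parts[1] === '') {
    throw new Error('malformed cursor')
  }
  const timestamp = Number(parts[0])
  if (!Number.isInteger(timestamp)) {
    throw new Error('malformed cursor')
  }
  return { timestamp, cid: parts[1] }
}
```

When querying, the timestamp selects the page boundary and the CID breaks ties between posts indexed in the same millisecond.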

## Suggestions for Implementation

How a Feed Generator fulfills the `getFeedSkeleton` request is completely at its discretion. At the simplest end, a Feed Generator could supply a "feed" that only contains some hardcoded posts.

For most use cases, we recommend subscribing to the firehose at `com.atproto.sync.subscribeRepos`. This websocket will send you every record that is published on the network. Since Feed Generators do not need to provide hydrated posts, you can index as much or as little of the firehose as necessary.

Depending on your algorithm, you likely do not need to keep posts around for long. Unless your algorithm is intended to provide "posts you missed" or something similar, you can likely garbage collect any data that is older than 48 hours.

Some examples:

### Reimplementing What's Hot
To reimplement "What's Hot", you might subscribe to the firehose & filter for all posts & likes (ignoring profiles/reposts/follows/etc). You would keep a running tally of likes per post & when a PDS requests a feed, you would send the most recent posts that pass some threshold of likes.
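
A running tally like this could be kept in memory as sketched below. This is an illustration only: the `LikeTally` class and its method names are hypothetical, firehose parsing is out of scope, and a real service would more likely keep the counts in its database.

```ts
// In-memory like counter keyed by post URI.
class LikeTally {
  private posts = new Map<string, { indexedAt: number; likes: number }>()

  // Call for every post seen on the firehose (indexedAt in unix ms).
  addPost(uri: string, indexedAt: number): void {
    if (!this.posts.has(uri)) this.posts.set(uri, { indexedAt, likes: 0 })
  }

  // Call for every like; likes on posts we never indexed are ignored.
  addLike(subjectUri: string): void {
    const entry = this.posts.get(subjectUri)
    if (entry) entry.likes += 1
  }

  // Most recent posts with at least `minLikes` likes, newest first.
  hot(minLikes: number, limit: number): string[] {
    return Array.from(this.posts.entries())
      .filter(([, entry]) => entry.likes >= minLikes)
      .sort((a, b) => b[1].indexedAt - a[1].indexedAt)
      .slice(0, limit)
      .map(([uri]) => uri)
  }

  // Garbage collect posts indexed before `cutoff` (e.g. now minus 48 hours).
  prune(cutoff: number): void {
    this.posts.forEach((entry, uri) => {
      if (entry.indexedAt < cutoff) this.posts.delete(uri)
    })
  }
}
```

Calling `prune` on a timer keeps memory bounded, in line with the 48-hour retention suggestion above.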

### A Community Feed
You might create a feed for a given community by compiling a list of DIDs within that community & filtering the firehose for all posts from users within that list.
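
The membership check could look like the sketch below. It assumes the authority segment of each post URI is the author's DID, as in the skeleton examples earlier; `authorOf` and `isCommunityPost` are hypothetical helpers.

```ts
// Extract the authority segment from `at://<authority>/<collection>/<rkey>`.
function authorOf(uri: string): string | undefined {
  const match = /^at:\/\/([^/]+)\//.exec(uri)
  return match ? match[1] : undefined
}

// True when the post was authored by one of the listed community DIDs.
function isCommunityPost(uri: string, members: Set<string>): boolean {
  const author = authorOf(uri)
  return author !== undefined && members.has(author)
}
```

Using a `Set` keeps each lookup constant-time, which matters when every firehose event is checked against the list.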

### A Topical Feed
To implement a topical feed, you might filter the firehose for posts and pass the post text through some filtering mechanism (an LLM, a keyword matcher, etc) that selects for the topic of your choice.
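
The simplest such mechanism is a keyword matcher, sketched below. `matchesTopic` is a hypothetical helper that does case-insensitive substring matching only, so it will also match keywords embedded in longer words.

```ts
// True when the post text contains any of the keywords (case-insensitive).
function matchesTopic(text: string, keywords: string[]): boolean {
  const haystack = text.toLowerCase()
  return keywords.some((keyword) => haystack.includes(keyword.toLowerCase()))
}
```

Posts that pass the check would be written to your index, and everything else dropped at ingestion time.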

/**
 * GENERATED CODE - DO NOT MODIFY
 */
import { ValidationResult, BlobRef } from '@atproto/lexicon'
import { lexicons } from '../../../../lexicons'
import { isObj, hasProp } from '../../../../util'
import { CID } from 'multiformats/cid'

/** Metadata tag on an atproto resource (eg, repo or record) */
export interface Label {
  /** DID of the actor who created this label */
  src: string
  /** AT URI of the record, repository (account), or other resource which this label applies to */
  uri: string
  /** optionally, CID specifying the specific version of 'uri' resource this label applies to */
  cid?: string
  /** the short string name of the value or type of this label */
  val: string
  /** if true, this is a negation label, overwriting a previous label */
  neg?: boolean
  /** timestamp when this label was created */
  cts: string
  [k: string]: unknown
}

export function isLabel(v: unknown): v is Label {
  return (
    isObj(v) &&
    hasProp(v, '$type') &&
    v.$type === 'com.atproto.label.defs#label'
  )
}

export function validateLabel(v: unknown): ValidationResult {
  return lexicons.validate('com.atproto.label.defs#label', v)
}

/**
 * GENERATED CODE - DO NOT MODIFY
 */
import express from 'express'
import { ValidationResult, BlobRef } from '@atproto/lexicon'
import { lexicons } from '../../../../lexicons'
import { isObj, hasProp } from '../../../../util'
import { CID } from 'multiformats/cid'
import { HandlerAuth } from '@atproto/xrpc-server'
import * as ComAtprotoLabelDefs from './defs'

export interface QueryParams {
  /** List of AT URI patterns to match (boolean 'OR'). Each may be a prefix (ending with '*'; will match inclusive of the string leading to '*'), or a full URI */
  uriPatterns: string[]
  /** Optional list of label sources (DIDs) to filter on */
  sources?: string[]
  limit: number
  cursor?: string
}

export type InputSchema = undefined

export interface OutputSchema {
  cursor?: string
  labels: ComAtprotoLabelDefs.Label[]
  [k: string]: unknown
}

export type HandlerInput = undefined

export interface HandlerSuccess {
  encoding: 'application/json'
  body: OutputSchema
}

export interface HandlerError {
  status: number
  message?: string
}

export type HandlerOutput = HandlerError | HandlerSuccess
export type Handler<HA extends HandlerAuth = never> = (ctx: {
  auth: HA
  params: QueryParams
  input: HandlerInput
  req: express.Request
  res: express.Response
}) => Promise<HandlerOutput> | HandlerOutput

/**
 * GENERATED CODE - DO NOT MODIFY
 */
export function isObj(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null
}

export function hasProp<K extends PropertyKey>(
  data: object,
  prop: K,
): data is Record<K, unknown> {
  return prop in data
}