# Parakeet

Parakeet is a [Bluesky](https://bsky.app) [AppView](https://atproto.wiki/en/wiki/reference/core-architecture/appview) aiming to implement most of the functionality required to support the Bluesky client. Notably not implemented is a CDN.

## Status and Roadmap

Most common functionality works. Future work is tracked in issues and on the [Skyboard](https://skyboard.dev/board/did:plc:rjo3yss3misbrdmpuc2eza6s/3meink5vuu22k), but the highlights are below. Help would be highly appreciated.

- Notifications
- Search
- Control Panel

## The Code

Parakeet is implemented in Rust, using Postgres as a database, Redis for caching and queue processing, RocksDB for aggregation, and Diesel for migrations and querying.

This repo is one big Rust workspace, containing nearly everything required to run and support the AppView.

### Packages

- consumer: Relay indexer, Label consumer, Backfiller. Takes raw records in from repos and stores them.
- dataloader-rs: a vendored fork of https://github.com/cksac/dataloader-rs, with some tweaks to fit caching requirements.
- lexica: Rust types for the relevant lexicons[sic] for Bluesky.
- parakeet: The core AppView code. Using Axum and Diesel.
- parakeet-db: Database types and models, also the Diesel schema.
- parakeet-index: Stats aggregator based on RocksDB. Uses gRPC with tonic.

There is also a dependency on a fork of [jsonwebtoken](https://gitlab.com/parakeet-social/jsonwebtoken) until upstream supports ES256K.

## Running

The "most supported" way is currently Nix - see below for more info. The Docker images may not build currently!

Prebuilt Docker images are published (semi) automatically by GitLab CI at https://gitlab.com/parakeet-social/parakeet. Use `registry.gitlab.com/parakeet-social/parakeet/[package]:[branch]` in your docker-compose.yml. There is currently no versioning until the project is more stable (sorry). You can also just build with cargo.

To run, you'll need Postgres (version 16 or higher), Redis or a Redis-compatible server, consumer, parakeet, and parakeet-index.

### Nix

A Nix Flake is provided, so you can add `git+https://tangled.org/parakeet.at/parakeet` to your flake.nix inputs. Packages `parakeet-appview` (crates/parakeet), `parakeet-consumer` (crates/consumer), and `parakeet-index` (crates/parakeet-index) are provided, along with systemd services for NixOS. You can see an example in [Mia's dotfiles](https://tangled.org/mia.pds.parakeet.at/config/blob/main/hosts/keldeo/services/parakeet.nix).

The flake is configured to set up a basic environment using `nix develop` (currently only cargo and friends, not postgres).

### Configuring

There are quite a lot of environment variables, although sensible defaults are provided when possible. Variables are prefixed by `PK`, `PKC`, or `PKI` depending on whether they're used in Parakeet, Consumer, or parakeet-index, respectively. Some are common to two or three parts, and are marked accordingly.

| Variable | Default | Description |
|-------------------------------------|--------------------------|-------------|
| (PK/PKC)_INDEX_URI | n/a | Required. URI of the parakeet-index instance in format `[host]:[port]` |
| (PK/PKC)_REDIS_URI | n/a | Required. URI of Redis (or compatible) in format `redis://[host]:[port]` |
| (PK/PKC)_PLC_DIRECTORY | `https://plc.directory` | Optional. A PLC mirror or different instance to use when resolving did:plc. |
| PKC_DATABASE__URL | n/a | Required. URI of Postgres in format `postgres://[user]:[pass]@[host]:[port]/[db]` |
| PKC_UA_CONTACT | n/a | Recommended. Some contact details (email / bluesky handle / website) to add to User-Agent. |
| PKC_LABEL_SOURCE | n/a | Required if consuming Labels. A labeler or label relay to consume. |
| PKC_RESUME_PATH | n/a | Required if consuming relay or label firehose. Where to store the cursor data. |
| PKC_INDEXER__RELAY_SOURCE | n/a | Required if consuming relay. Relay to consume from. |
| PKC_INDEXER__HISTORY_MODE | n/a | Required if consuming relay. `backfill_history` or `realtime` depending on whether you plan to backfill when consuming record data from a relay. |
| PKC_INDEXER__INDEXER_WORKERS | 4 | How many workers to spread indexing work between. 4 or 6 usually works depending on load. Ensure you have enough DB connections available. |
| PKC_INDEXER__START_COMMIT_SEQ | n/a | Optionally, the relay sequence to start consuming from. Overridden by the data in PKC_RESUME_PATH, so clear that first if you reset. |
| PKC_INDEXER__SKIP_HANDLE_VALIDATION | false | Should the indexer SKIP validating handles from `#identity` events. |
| PKC_INDEXER__REQUEST_BACKFILL | false | Should the indexer request backfill when relevant. Only applies when `backfill_history` is set. You likely want TRUE, unless you're manually controlling backfill queues. |
| PKC_BACKFILL__WORKERS | 4 | How many workers to use when backfilling into the DB. Ensure you have enough DB connections available as one is created per worker. |
| PKC_BACKFILL__SKIP_AGGREGATION | false | Whether to skip sending aggregation to parakeet-index. Does not remove the index requirement. Useful when developing. |
| PKC_BACKFILL__DOWNLOAD_WORKERS | 25 | How many workers to use to download repos for backfilling. |
| PKC_BACKFILL__DOWNLOAD_BUFFER | 25000 | How many repos to download and queue. |
| PKC_BACKFILL__DOWNLOAD_TMP_DIR | n/a | Where to download repos to. Ensure there is enough space. |
| PKC_METRICS_PORT | 9000 | Port to bind to for Prometheus metrics in Consumer |
| (PK/PKI)_SERVER__BIND_ADDRESS | `0.0.0.0` | Address for the server to bind to. For index outside of docker, you probably want loopback as there is no auth. |
| (PK/PKI)_SERVER__PORT | PK: 6000, PKI: 6001 | Port for the server to bind to. |
| (PK/PKI)_DATABASE_URL | n/a | Required. URI of Postgres in format `postgres://[user]:[pass]@[host]:[port]/[db]` |
| PK_SERVICE__DID | n/a | DID for the AppView in did:web. (did:plc is possible but untested) |
| PK_SERVICE__PUBLIC_KEY | n/a | Public key for the AppView. Unsure if actually used, but may be required by PDS. |
| PK_SERVICE__ENDPOINT | n/a | HTTPS publicly accessible endpoint for the AppView. |
| PK_TRUSTED_VERIFIERS | n/a | Optionally, trusted verifiers to use. For many, join with `,`. |
| PK_CDN__BASE | `https://cdn.bsky.app` | Optionally, base URL for a Bluesky compatible CDN |
| PK_CDN__VIDEO_BASE | `https://video.bsky.app` | Optionally, base URL for a Bluesky compatible video CDN |
| PK_DID_ALLOWLIST | n/a | Optional. If set, controls which DIDs can access the AppView. For many, join with `,` |
| PK_MIGRATE | false | Set to TRUE to run database migrations automatically on start. |
| PKI_INDEX_DB_PATH | n/a | Required. Location to store the index database. |
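
As a rough starting point, the table above could translate into something like the following for a single-machine, realtime-only setup. This is a sketch, not a tested configuration: the variable names are copied from the table, but every value (hosts, ports, credentials, paths, the `example.com` domain, and the relay URL format) is a placeholder assumption you will need to replace.

```shell
# Hypothetical values throughout; only the variable names come from the table.

# parakeet-index (PKI): bind to loopback since there is no auth.
export PKI_INDEX_DB_PATH="/var/lib/parakeet/index"
export PKI_SERVER__BIND_ADDRESS="127.0.0.1"
export PKI_SERVER__PORT="6001"

# consumer (PKC): realtime indexing from a relay, no backfill.
export PKC_INDEX_URI="localhost:6001"
export PKC_REDIS_URI="redis://localhost:6379"
export PKC_DATABASE__URL="postgres://parakeet:changeme@localhost:5432/parakeet"
export PKC_RESUME_PATH="/var/lib/parakeet/cursor"
# Assumption: the relay is given as a websocket URL; check what the consumer expects.
export PKC_INDEXER__RELAY_SOURCE="wss://relay.example.com"
export PKC_INDEXER__HISTORY_MODE="realtime"
export PKC_UA_CONTACT="admin@example.com"

# parakeet (PK): the AppView itself.
export PK_INDEX_URI="localhost:6001"
export PK_REDIS_URI="redis://localhost:6379"
export PK_DATABASE_URL="postgres://parakeet:changeme@localhost:5432/parakeet"
export PK_SERVICE__DID="did:web:appview.example.com"
export PK_SERVICE__ENDPOINT="https://appview.example.com"
export PK_MIGRATE="true"
```

Anything not set here falls back to the defaults listed in the table. If you plan to backfill, you would switch `PKC_INDEXER__HISTORY_MODE` to `backfill_history` and look at the `PKC_BACKFILL__*` variables as well.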