
Design Decisions#

This document captures key architectural decisions for at-rund.


Isolation Backends#

Decision: Support multiple isolation backends (none, container, firecracker) behind a common Executor interface.

Date: 2026-05-02

Context#

at-rund needs to execute untrusted code safely. The gold standard is VM-level isolation (Firecracker), but this requires:

  • Linux host
  • KVM support (/dev/kvm)
  • Bare-metal or nested virtualization

Most VPS providers (Linode, Contabo, standard DigitalOcean) don't support nested virtualization. Requiring bare-metal servers would severely limit adoption.

Options Considered#

| Option | Isolation | Requirements | Barrier to Entry |
|---|---|---|---|
| Firecracker | VM-level (strongest) | Linux + KVM | High (bare-metal only) |
| gVisor | Syscall interception | Linux only | Medium |
| Containers + seccomp | Namespace + syscall filtering | Linux + Docker/Podman | Low |
| Direct execution | None (permissions only) | Any OS + Nix | Lowest |

Decision#

Support three of the options above (direct execution, containers, and Firecracker) behind a common interface:

type Executor interface {
    Execute(req ExecuteRequest, mimeType string) (*ExecuteResponse, error)
    Stats() PoolStats
    Warm(count int) error
    Drain()
    Shutdown()
}

Implementations:

  • NixPool — Direct execution via Nix, no isolation
  • ContainerPool — OCI containers with debian-slim base + seccomp
  • FirecrackerPool — Firecracker microVMs with virtio-fs

Auto-Detection#

When isolation = "auto" (default):

1. Check /dev/kvm exists and is accessible
   └─ Yes → FirecrackerPool
   └─ No  → Continue

2. Check docker/podman available
   └─ Yes → ContainerPool
   └─ No  → Continue

3. Fallback → NixPool (with warning)
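The detection order above can be sketched in Go. The probes are injected as functions so the decision logic is testable; the function names and return values here are illustrative, not at-rund's actual API:

```go
package main

import (
	"os"
	"os/exec"
)

// detectBackend mirrors the "auto" detection order: KVM first, then a
// container runtime, then direct execution as a fallback.
func detectBackend(kvmOK func() bool, haveRuntime func(name string) bool) string {
	if kvmOK() {
		return "firecracker" // 1. /dev/kvm usable → FirecrackerPool
	}
	if haveRuntime("docker") || haveRuntime("podman") {
		return "container" // 2. container runtime available → ContainerPool
	}
	return "none" // 3. fallback → NixPool (real code should log a warning here)
}

// Default probes for a real host.
func kvmAvailable() bool {
	f, err := os.OpenFile("/dev/kvm", os.O_RDWR, 0)
	if err != nil {
		return false // missing or no permission
	}
	f.Close()
	return true
}

func runtimeOnPath(name string) bool {
	_, err := exec.LookPath(name)
	return err == nil
}
```

Injecting the probes also makes it easy for an operator to force a backend in tests without touching /dev/kvm or PATH.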

Runtime Artifacts#

Nix builds each runtime as multiple artifacts:

nix/runtimes/deno/
├── flake.nix
├── executor.ts          # Runtime-specific executor
└── outputs:
    ├── at-run-exec      # Direct execution wrapper
    ├── image.tar        # OCI image for containers
    └── rootfs.ext4      # Filesystem for Firecracker

The same Nix expression produces all three, ensuring identical runtime behavior.

Security Model#

| Backend | Kernel Shared | Escape Risk | Good For |
|---|---|---|---|
| none | Yes | High | Dev/testing, trusted code |
| container | Yes | Medium | Production, social trust |
| firecracker | No | Very Low | High-security, untrusted code |

The at-rund trust model assumes operators choose who can run code on their infrastructure. This means:

  • container isolation is reasonable for most deployments
  • Operators handling truly untrusted code should use firecracker on bare-metal
  • none is for development only

Consequences#

Positive:

  • Any Linux VPS can run at-rund in production (containers)
  • Bare-metal operators get strongest isolation (Firecracker)
  • Developers can test on macOS/Windows (via Docker or direct)
  • Common interface means bundles work identically everywhere

Negative:

  • More code to maintain (3 backends)
  • Container isolation is weaker than VMs
  • Need to document security tradeoffs clearly

Runtime Executor Pattern#

Decision: Each runtime defines its own at-run-exec that handles permission translation and execution.

Date: 2026-05-01

Context#

Different runtimes handle permissions differently:

  • Deno: --allow-net=host1,host2
  • Node: Environment variables or custom sandbox
  • Python: No built-in sandboxing

Initially, we tried handling this in Go code, but this meant:

  • Hardcoded permission logic per runtime
  • Changes required recompiling at-rund
  • No way for operators to customize

Decision#

Move executor logic into the Nix runtime definition:

at-rund (Go)                    Runtime (Nix)
     │                               │
     │  1. Build ExecRequest JSON    │
     │                               │
     └────── stdin ─────────────────▶│ at-run-exec
                                     │   - Parse request
                                     │   - Translate permissions
                                     │   - Import bundle
                                     │   - Call endpoint
                                     │   - Return JSON response
     ◀────── stdout ─────────────────┘

The at-run-exec script is runtime-specific:

  • Deno: TypeScript that builds --allow-* flags
  • Node: Could use vm2 or similar
  • Python: Could use seccomp or RestrictedPython

ExecRequest Format#

{
  "codePath": "/path/to/bundle.js",
  "endpoint": "handleRequest",
  "args": { "input": "data" },
  "permissions": {
    "net": ["api.example.com", "*.cdn.com"],
    "read": ["/tmp/cache"],
    "write": ["/tmp/output"],
    "env": ["API_KEY", "DEBUG"]
  },
  "env": {
    "API_KEY": "decrypted-secret"
  },
  "timeout": 30
}

ExecResponse Format#

{
  "success": true,
  "data": { "result": "value" },
  "response": {
    "status": 200,
    "headers": { "content-type": "application/json" },
    "body": "...",
    "isBase64": false
  },
  "error": "error message if success=false",
  "metrics": {
    "executionTimeMs": 142
  }
}
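Mapped to Go, the two wire formats above might be declared as follows. Field names mirror the JSON keys; the struct names themselves (ExecRequest, Permissions, and so on) are assumptions for illustration:

```go
package main

import "encoding/json"

type Permissions struct {
	Net   []string `json:"net,omitempty"`
	Read  []string `json:"read,omitempty"`
	Write []string `json:"write,omitempty"`
	Env   []string `json:"env,omitempty"`
}

type ExecRequest struct {
	CodePath    string            `json:"codePath"`
	Endpoint    string            `json:"endpoint"`
	Args        map[string]any    `json:"args,omitempty"`
	Permissions Permissions       `json:"permissions"`
	Env         map[string]string `json:"env,omitempty"`
	Timeout     int               `json:"timeout"` // seconds
}

type HTTPResponse struct {
	Status   int               `json:"status"`
	Headers  map[string]string `json:"headers,omitempty"`
	Body     string            `json:"body"`
	IsBase64 bool              `json:"isBase64"`
}

type Metrics struct {
	ExecutionTimeMs int64 `json:"executionTimeMs"`
}

type ExecResponse struct {
	Success  bool           `json:"success"`
	Data     map[string]any `json:"data,omitempty"`
	Response *HTTPResponse  `json:"response,omitempty"`
	Error    string         `json:"error,omitempty"` // set when success=false
	Metrics  *Metrics       `json:"metrics,omitempty"`
}

// decodeRequest parses an ExecRequest from raw JSON.
func decodeRequest(raw []byte) (ExecRequest, error) {
	var req ExecRequest
	err := json.Unmarshal(raw, &req)
	return req, err
}
```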

Consequences#

Positive:

  • Runtime authors control permission translation
  • Operators can customize/fork runtimes
  • at-rund stays runtime-agnostic
  • Easy to add new runtimes

Negative:

  • More complexity in runtime definitions
  • Permission bugs are per-runtime, not centralized

Social Trust Model#

Decision: Operators explicitly choose who can run code; there is no automatic discovery or federation.

Date: 2026-05-01

Context#

Traditional serverless (AWS Lambda, Cloudflare Workers) has a clear trust model: you trust the provider, period. Decentralized alternatives often propose automatic discovery and federation, which creates new trust problems.

Decision#

Trust is explicit and social:

  1. Operators choose bundles: Via allowlist, blocklist, or open mode
  2. Authors choose runners: By manually configuring which runner(s) to use
  3. No automatic discovery: Runners don't announce themselves to a registry
  4. No federation: Each runner is independent

This mirrors how the AT Protocol itself works — you follow people you know, not everyone.

Access Control#

[access]
mode = "allowlist"  # or "blocklist" or "open"

# Only these DIDs can run bundles
allowlist = [
  "did:plc:friend1",
  "did:plc:friend2",
]
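The three modes reduce to a small membership check. A sketch of how a runner might evaluate a caller's DID against this config (the struct and method names are assumptions; the field names follow the TOML keys):

```go
package main

type AccessConfig struct {
	Mode      string   // "allowlist", "blocklist", or "open"
	Allowlist []string // DIDs permitted when mode = "allowlist"
	Blocklist []string // DIDs rejected when mode = "blocklist"
}

// Allowed reports whether the given DID may run bundles on this runner.
func (c AccessConfig) Allowed(did string) bool {
	switch c.Mode {
	case "open":
		return true
	case "allowlist":
		for _, d := range c.Allowlist {
			if d == did {
				return true
			}
		}
		return false
	case "blocklist":
		for _, d := range c.Blocklist {
			if d == did {
				return false
			}
		}
		return true
	default:
		return false // unknown mode: fail closed
	}
}
```

Failing closed on an unrecognized mode keeps a typo in the config from silently opening the runner to everyone.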

Consequences#

Positive:

  • Simple mental model
  • Operators have full control
  • No Sybil attacks on "discovery"
  • Matches AT Protocol philosophy

Negative:

  • Authors must manually find runners
  • No marketplace/registry (for now)
  • Fragmented ecosystem possible

Future Consideration#

Optional announcement could be added later:

  • Post runner capabilities to your PDS
  • Others discover via social graph
  • Still no central registry

Nix for Runtime Builds#

Decision: Use Nix to define and build all runtime environments.

Date: 2026-05-01

Context#

Bundles must behave identically across:

  • Development (macOS, direct execution)
  • Container isolation (Linux VPS)
  • VM isolation (bare-metal)

Docker images alone don't solve this — you still need to define what goes in them, and dev machines often can't run Docker efficiently.

Decision#

Nix defines each runtime declaratively:

{
  packages = [ pkgs.deno pkgs.jq ];
  
  # Script that handles execution
  at-run-exec = writeShellScriptBin "at-run-exec" ''
    ...
  '';
  
  # OCI image for containers
  ociImage = dockerTools.buildImage { ... };
  
  # Rootfs for Firecracker
  rootfs = makeExt4 { ... };
}

One definition, multiple outputs, identical behavior.

Consequences#

Positive:

  • Reproducible builds
  • Same runtime dev → prod
  • Easy to customize/extend
  • Hermetic (no "works on my machine")

Negative:

  • Nix has steep learning curve
  • Build times can be long (first time)
  • Extra dependency for operators

Future Decisions#

Topics that need decisions as development continues:

  • Secrets management: How are secrets encrypted/decrypted?
  • Bundle signing: Should bundles be signed? By whom?
  • Rate limiting: Built-in or middleware only?
  • Multi-tenancy: Separate pools per DID?
  • Persistence: Stateful bundles? How?