# Roadmap

This document outlines the development roadmap for at-rund.

## Current Status: Alpha (Under Heavy Construction)

The core architecture is in place. Dev mode works on macOS/Linux with Nix. Production isolation is being implemented with multiple backends.

**See [DESIGN.md](./DESIGN.md) for architectural decisions.**

---

## Phase 1: Core Functionality ✅

**Goal:** A working end-to-end system where bundles can be fetched from a PDS and executed.

### ATProto Integration
- [x] DID resolution (did:plc, did:web)
- [x] PDS client for fetching bundle records
- [x] Bundle blob fetching and caching
- [x] Manifest parsing (permissions, runtime, limits)

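As a rough illustration of the DID resolution step, the sketch below maps a DID to the URL where its DID document is normally fetched. It assumes the public PLC directory at `https://plc.directory` for `did:plc` and the generic `.well-known/did.json` rule for `did:web`. The helper name `didDocumentURL` is hypothetical and is not taken from the at-rund codebase.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// didDocumentURL returns the HTTPS URL where the DID document for did is
// expected to live. Hypothetical helper, for illustration only.
func didDocumentURL(did string) (string, error) {
	switch {
	case strings.HasPrefix(did, "did:plc:"):
		// Assumption: the public PLC directory is used.
		return "https://plc.directory/" + did, nil
	case strings.HasPrefix(did, "did:web:"):
		parts := strings.Split(strings.TrimPrefix(did, "did:web:"), ":")
		// The first segment is the host; a port is percent-encoded as %3A.
		host, err := url.PathUnescape(parts[0])
		if err != nil || host == "" {
			return "", fmt.Errorf("invalid did:web host in %q", did)
		}
		if len(parts) == 1 {
			return "https://" + host + "/.well-known/did.json", nil
		}
		// Any remaining segments become path components.
		return "https://" + host + "/" + strings.Join(parts[1:], "/") + "/did.json", nil
	default:
		return "", fmt.Errorf("unsupported DID method: %q", did)
	}
}

func main() {
	for _, did := range []string{"did:plc:abc123", "did:web:example.com"} {
		u, err := didDocumentURL(did)
		fmt.Println(u, err)
	}
}
```

Fetching and validating the document itself (and locating the PDS service entry inside it) is left out of this sketch.
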
### Bundle Execution
- [x] Wire up executor to HTTP routes
- [x] Permission enforcement (net, read, write, env)
- [ ] Resource limits (memory, CPU, timeout)
- [ ] Secrets decryption and injection

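The four permission kinds above (net, read, write, env) could be checked along the following lines. This is a minimal sketch: the `Permissions` struct, its field layout, and the method names are assumptions made for illustration, not the actual manifest schema or executor API.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Permissions mirrors the four permission kinds named in the roadmap.
// The field layout is an assumption, not the real manifest schema.
type Permissions struct {
	Net   []string // allowed hosts
	Read  []string // readable directory roots
	Write []string // writable directory roots
	Env   []string // allowed environment variable names
}

// contains reports whether s is an exact member of list.
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// underAny reports whether path lies inside one of roots once both are
// cleaned, so "/bundle/../etc" does not count as inside "/bundle".
// Symlinks are not resolved here.
func underAny(roots []string, path string) bool {
	for _, root := range roots {
		rel, err := filepath.Rel(root, path)
		if err != nil {
			continue
		}
		if rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
			return true
		}
	}
	return false
}

func (p Permissions) AllowNet(host string) bool   { return contains(p.Net, host) }
func (p Permissions) AllowRead(path string) bool  { return underAny(p.Read, path) }
func (p Permissions) AllowWrite(path string) bool { return underAny(p.Write, path) }
func (p Permissions) AllowEnv(name string) bool   { return contains(p.Env, name) }

func main() {
	p := Permissions{
		Net:  []string{"api.example.com"},
		Read: []string{"/bundle"},
		Env:  []string{"API_KEY"},
	}
	fmt.Println(p.AllowNet("api.example.com"), p.AllowRead("/bundle/../etc/passwd"), p.AllowEnv("HOME"))
}
```

Everything is denied unless listed, which keeps an empty manifest safe by default.
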
### Dev Mode
- [x] Nix-based execution (NixPool)
- [x] Auto-detection of capabilities
- [x] Runtime executor pattern (at-run-exec)
- [ ] Hot reload for local development
- [ ] Better error messages

---

## Phase 2: Production Isolation

**Goal:** Multiple isolation backends to balance security vs. accessibility.

```
isolation = "auto" | "none" | "container" | "firecracker"
```

### Container Backend (In Progress)
- [ ] ContainerPool executor implementation
- [ ] OCI image building via Nix (debian-slim base)
- [ ] Docker/Podman runtime detection
- [ ] seccomp profiles for syscall filtering
- [ ] Network namespace isolation
- [ ] Permission enforcement via container config

### Firecracker Backend (Future)
- [ ] FirecrackerPool executor implementation
- [ ] Kernel + rootfs image building via Nix
- [ ] VM lifecycle management (spawn, stop, reuse)
- [ ] virtio-fs for bundle mounting
- [ ] vsock for host ↔ guest communication
- [ ] Guest agent (Go binary inside VMs)

### Shared Infrastructure
- [ ] Auto-detection logic (KVM → container → none)
- [ ] Pre-warming (configurable per runtime)
- [ ] Idle timeout and reclamation
- [ ] Max instance limits
- [ ] Graceful drain on shutdown
- [ ] Network proxy with permission enforcement

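The auto-detection order listed above (KVM → container → none) might be resolved roughly as follows. This sketch assumes that the presence of `/dev/kvm` stands in for Firecracker support and that finding `docker` or `podman` on `PATH` stands in for container support; a real probe would likely also check device permissions and that the runtime actually responds. The `Capabilities` type and both function names are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// Capabilities records what the host offers. Hypothetical type.
type Capabilities struct {
	KVM       bool // /dev/kvm exists
	Container bool // docker or podman is on PATH
}

// probe inspects the current host. Existence checks only; see the caveats
// in the surrounding text.
func probe() Capabilities {
	_, kvmErr := os.Stat("/dev/kvm")
	_, dockerErr := exec.LookPath("docker")
	_, podmanErr := exec.LookPath("podman")
	return Capabilities{
		KVM:       kvmErr == nil,
		Container: dockerErr == nil || podmanErr == nil,
	}
}

// resolveIsolation turns the configured isolation value into a concrete
// backend, failing when an explicitly requested backend is unavailable.
func resolveIsolation(mode string, c Capabilities) (string, error) {
	switch mode {
	case "auto":
		switch {
		case c.KVM:
			return "firecracker", nil
		case c.Container:
			return "container", nil
		default:
			return "none", nil
		}
	case "firecracker":
		if !c.KVM {
			return "", fmt.Errorf("isolation %q requires KVM", mode)
		}
		return mode, nil
	case "container":
		if !c.Container {
			return "", fmt.Errorf("isolation %q requires docker or podman", mode)
		}
		return mode, nil
	case "none":
		return mode, nil
	default:
		return "", fmt.Errorf("unknown isolation mode %q", mode)
	}
}

func main() {
	backend, err := resolveIsolation("auto", probe())
	fmt.Println(backend, err)
}
```

Keeping `resolveIsolation` free of host access makes the fallback order easy to test without KVM or a container runtime installed.
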
---

## Phase 3: Observability

**Goal:** Operators can monitor their runners effectively.

### Metrics
- [ ] OpenTelemetry integration
- [ ] Request count, latency, error rate
- [ ] Per-bundle, per-DID breakdowns
- [ ] VM pool utilization
- [ ] Resource usage (memory, CPU)

### Logging
- [ ] Structured JSON logs
- [ ] Request tracing (trace IDs)
- [ ] Bundle execution logs (opt-in)

### Dashboard
- [ ] Example Grafana dashboard
- [ ] Prometheus scrape endpoint (`/metrics`)

---

## Phase 4: Operator Experience

**Goal:** Make it easy to run a production at-rund instance.

### Deployment
- [x] systemd service support
- [ ] Docker image
- [ ] Nix flake for NixOS deployment
- [ ] Ansible/Terraform examples

### Configuration
- [ ] Config validation on startup
- [ ] Reload config without restart (SIGHUP)
- [ ] Environment variable overrides

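Environment variable overrides combined with validation on startup could look something like the sketch below. The `Config` fields, the `AT_RUND_` prefix, and the variable names are all assumptions made for illustration; the real configuration keys are not specified in this roadmap.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Config holds two illustrative settings. Field names are hypothetical.
type Config struct {
	Isolation    string
	MaxInstances int
}

// applyEnvOverrides replaces file-provided values with environment values
// when they are set, and rejects values that fail validation.
// getenv has the same shape as os.LookupEnv so tests can supply a fake.
func applyEnvOverrides(c Config, getenv func(string) (string, bool)) (Config, error) {
	if v, ok := getenv("AT_RUND_ISOLATION"); ok {
		c.Isolation = v
	}
	if v, ok := getenv("AT_RUND_MAX_INSTANCES"); ok {
		n, err := strconv.Atoi(v)
		if err != nil || n < 1 {
			return c, fmt.Errorf("AT_RUND_MAX_INSTANCES must be a positive integer, got %q", v)
		}
		c.MaxInstances = n
	}
	return c, nil
}

func main() {
	fromFile := Config{Isolation: "auto", MaxInstances: 4}
	cfg, err := applyEnvOverrides(fromFile, os.LookupEnv)
	fmt.Println(cfg, err)
}
```

The same function could be re-run from a SIGHUP handler so that a reload goes through the validation used at startup.
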
### Security
- [ ] Security hardening guide
- [ ] Firewall recommendations
- [ ] TLS termination examples (nginx, caddy)

---

## Phase 5: Advanced Features

**Goal:** Features for larger-scale or specialized deployments.

### Tasks & Jobs
- [ ] Port task queue from at-run v1
- [ ] Background job execution
- [ ] Cron scheduling
- [ ] Result caching

### Multi-Node
- [ ] Shared state (Redis, SQLite)
- [ ] Load balancing considerations
- [ ] Sticky sessions for stateful bundles

### Custom Runtimes
- [ ] Runtime marketplace/registry (community-contributed)
- [ ] Documentation for writing runtimes
- [ ] Testing framework for runtimes

---

## Phase 6: Ecosystem

**Goal:** at-rund becomes part of a thriving ecosystem.

### Discovery
- [ ] Runner announcement protocol (optional)
- [ ] Capability advertisement (runtimes, limits)
- [ ] Uptime/health signaling

### Developer Experience
- [ ] `at-run test --runner <url>` for testing against remote runners
- [ ] Bundle compatibility checker
- [ ] Performance profiling

### Documentation
- [ ] Operator guide
- [ ] Security model explanation
- [ ] Troubleshooting guide
- [ ] Video tutorials

---

## Non-Goals (For Now)

These are explicitly out of scope for the initial releases:

- **Automatic runner discovery** — Trust is social; discovery is manual
- **Payment/billing integration** — Use middleware if needed
- **Multi-region orchestration** — Each runner is independent
- **Bundle validation/signing** — Trust the author, not the code
- **Centralized registry** — Bundles live on user PDSes

---

## Contributing

We welcome contributions! Areas where help is especially appreciated:

1. **Runtime definitions** — Create Nix configs for new runtimes
2. **Testing** — Run at-rund and report issues
3. **Documentation** — Improve guides and examples
4. **Firecracker expertise** — Help with VM integration

See [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines.

---

## Version History

| Version | Status | Notes |
|---------|--------|-------|
| 0.1.0 | Alpha | Initial scaffolding, dev mode works |
| 0.2.0 | — | ATProto integration, bundle execution |
| 0.3.0 | — | Firecracker production mode |
| 0.4.0 | — | Observability (OTLP, metrics) |
| 1.0.0 | — | Production ready |