Mirror of https://github.com/roostorg/osprey
# Setup Guide

This guide provides comprehensive instructions for setting up a development environment for Osprey.

## Prerequisites

- **Operating System**: macOS, Linux, or Windows (with WSL recommended)
- **[Python](https://www.python.org/) 3.11 or higher** (check with `python --version`)
- **[Git](https://git-scm.com/)** for version control
- **[uv](https://docs.astral.sh/uv/)** for Python package management
- **[npm](https://nodejs.org/en/download)**

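The prerequisites above can be checked in one pass before cloning. The following is a minimal sketch, not part of Osprey: `check_prerequisites` is a hypothetical helper name, and it assumes the tools are installed under their usual executable names (`git`, `uv`, `npm`).

```python
import shutil
import sys

# Tools listed in the prerequisites; executable names are assumed to be the defaults.
REQUIRED_TOOLS = ("git", "uv", "npm")
MIN_PYTHON = (3, 11)


def check_prerequisites(tools=REQUIRED_TOOLS, min_python=MIN_PYTHON, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        found = ".".join(map(str, sys.version_info[:2]))
        wanted = ".".join(map(str, min_python))
        problems.append(f"Python {wanted}+ required, found {found}")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print(f"MISSING: {issue}")
    print("All prerequisites found." if not issues else f"{len(issues)} problem(s) found.")
```

The script only confirms that each tool is on `PATH`; it does not verify tool versions other than Python's.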
## Project Setup

### 1. Clone the Repository

```bash
git clone git@github.com:roostorg/osprey.git
cd osprey
```

### 2. Install Dependencies

```bash
# Install all dependencies including development tools
uv sync
```

This command will:

- Create a virtual environment automatically
- Install all production dependencies
- Install development dependencies (ruff, mypy, pre-commit) automatically
- Use the locked versions from `uv.lock` for reproducible builds

**Note**: `uv sync` includes development dependencies by default. Use `uv sync --no-dev` if you only want production dependencies.

### 3. Set Up Pre-commit Hooks

```bash
uv run pre-commit install
```

This installs Git hooks that automatically run code quality checks before each commit.

### 4. Verify Setup

Run these commands to ensure everything is working correctly:

```bash
# Run lint checks
uv run ruff check

# Check formatting (prints a diff without modifying files)
uv run ruff format --diff

# Run type checking
uv run mypy .

# Test pre-commit hooks
uv run pre-commit run --all-files
```

**Expected Results:**

- Ruff should report "All checks passed!" or show specific issues to fix
- mypy should run without errors
- Pre-commit should run all hooks successfully

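If you prefer a single pass/fail result, the four verification commands can be wrapped in a small runner. This is a sketch under the assumption that each tool signals problems through a non-zero exit code; `run_checks` and `CHECKS` are hypothetical names, not part of the repository.

```python
import subprocess

# The four verification commands from this step.
CHECKS = [
    ["uv", "run", "ruff", "check"],
    ["uv", "run", "ruff", "format", "--diff"],
    ["uv", "run", "mypy", "."],
    ["uv", "run", "pre-commit", "run", "--all-files"],
]


def run_checks(commands):
    """Run each command in order and return the list of commands that failed."""
    failed = []
    for cmd in commands:
        print("$", " ".join(cmd))
        try:
            # A non-zero exit code is treated as a failed check.
            ok = subprocess.run(cmd).returncode == 0
        except FileNotFoundError:
            # The executable itself (for example, uv) is not installed.
            ok = False
        if not ok:
            failed.append(cmd)
    return failed
```

Calling `run_checks(CHECKS)` from the repository root returns an empty list when every check passes. The runner deliberately continues after a failure so that one run surfaces all problems.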
### 5. Getting Started

```bash
docker compose up -d
```

Or use the wrapper script:

```bash
./start.sh
```

This starts several services, including:

- **Osprey Worker**: The main engine that processes input events using the rules and UDFs
  - **Test Data Producer**: Optional; enable with `--profile test_data`
- **Osprey UI**: Frontend service that hosts the React code for the web interface and communicates with the UI API
- **Osprey UI API**: Backend service that provides data and functionality to the web interface
- **Kafka** (KRaft mode): Message streaming for user-generated events
- **Postgres**: A database that the Worker, UI API, and Druid use for several purposes, such as the Postgres-backed Labels Service (in the example plugins)
- **Druid**: A database that consumes Osprey Worker outputs to power the UI API for real-time querying

Alternatively, you can start Osprey with `osprey-coordinator`. Refer to the [Coordinator README](../example_docker_compose/run_osprey_with_coordinator/README.md) for more information.

### 6. (Optional) Open ports for the UI/UI API

By default, `docker-compose.yaml` binds running services to `127.0.0.1`. If you are running Docker Compose on a headless machine, you may need to modify this configuration and/or change your firewall rules, specifically for ports `5002` and `5004`.

For example, if you use Tailscale to access your Osprey instance, you can change `127.0.0.1:5002:5002` to `<Tailscale IP>:5002:5002`. Alternatively, if you want your instance to be accessible from the public internet, you can set it to `5002:5002`, which binds to `0.0.0.0`.

Be aware that some firewalls, such as iptables/UFW, do _not_ prevent access to ports published by Docker. If you omit the bind address and rely only on UFW, the port stays reachable from the public internet unless UFW is [properly configured](https://github.com/chaifeng/ufw-docker).

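The port mapping rewrite described in this step follows the Compose short syntax `HOST_IP:HOST_PORT:CONTAINER_PORT`. The sketch below illustrates the string transformation only; `rebind` is a hypothetical helper, `100.64.0.1` is a placeholder address, and IPv6 bind addresses are not handled.

```python
def rebind(port_mapping: str, host_ip: str = "") -> str:
    """Rewrite the host-IP part of a Compose short-syntax port mapping.

    Accepts "HOST_IP:HOST_PORT:CONTAINER_PORT" or "HOST_PORT:CONTAINER_PORT".
    An empty host_ip drops the bind address, which publishes on all interfaces.
    IPv4 only: an IPv6 address would contain extra colons.
    """
    parts = port_mapping.split(":")
    if len(parts) not in (2, 3):
        raise ValueError(f"unsupported port mapping: {port_mapping!r}")
    host_port, container_port = parts[-2], parts[-1]
    mapping = f"{host_port}:{container_port}"
    return f"{host_ip}:{mapping}" if host_ip else mapping


print(rebind("127.0.0.1:5002:5002", "100.64.0.1"))  # 100.64.0.1:5002:5002
print(rebind("127.0.0.1:5002:5002"))                # 5002:5002
```

The second call shows why omitting the address is the risky case: nothing in the resulting mapping restricts which interface the port is published on.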
### 7. Access the Application

The UI will automatically connect to the backend services running in Docker containers.

- Osprey UI: [localhost:5002](http://localhost:5002)
- Backend API: [localhost:5004](http://localhost:5004)
- Worker Service: [localhost:5001](http://localhost:5001)

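To confirm that the three endpoints above are up, a plain TCP probe is usually enough. This is a minimal sketch: `is_listening` and `report` are hypothetical helpers, and a successful connection only shows that something accepts connections on the port, not that the service behind it is healthy.

```python
import socket

# Ports taken from the list above.
SERVICES = {"Osprey UI": 5002, "Backend API": 5004, "Worker Service": 5001}


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def report(services=SERVICES, host="127.0.0.1"):
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in services.items()}
```

Calling `report()` shortly after `docker compose up -d` may show `False` for services that are still starting; retrying after a few seconds is normal.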
## Plugins

In Osprey, UDFs and output sinks are designed to be easily portable. This is done through a plugin system based on pluggy. An example plugin package is provided for reference; see `example_plugins/register_plugins.py`:

```python
# Outline only: hookimpl_osprey, UDFBase, Config, and BaseOutputSink are
# defined by Osprey; see the example file for the actual imports.
from typing import Any, Sequence, Type

@hookimpl_osprey
def register_udfs() -> Sequence[Type[UDFBase[Any, Any]]]:
    # Register custom user-defined functions
    ...

@hookimpl_osprey
def register_output_sinks(config: Config) -> Sequence[BaseOutputSink]:
    # Define output destinations
    # By default it prints the execution results to the console
    ...

@hookimpl_osprey
def register_ast_validators() -> None:
    # Register AST validators
    ...
```

## Rules

Rules are written in SML. Some examples, along with their YAML config, are provided in `example_rules/`. The rules are mounted to the worker processes via environment variables when the containers start. For example:

```bash
OSPREY_RULES=./example_rules uv run python3.11 osprey_worker/src/osprey/worker/cli/sinks.py run-rules-sink
```

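Before starting the worker, it can help to confirm that `OSPREY_RULES` points at a real directory. The following pre-flight check is a sketch and is not Osprey's own loader: `find_rule_files` is a hypothetical helper, and the file extensions (`.sml`, `.yaml`, `.yml`) are assumptions about how the example rules are named.

```python
import os
from pathlib import Path


def find_rule_files(env=os.environ, var="OSPREY_RULES"):
    """Resolve the rules directory named by the environment variable and list its files.

    Assumption: rule files end in .sml and config files in .yaml or .yml.
    """
    raw = env.get(var)
    if not raw:
        raise RuntimeError(f"{var} is not set")
    root = Path(raw).expanduser().resolve()
    if not root.is_dir():
        raise RuntimeError(f"{var} does not point to a directory: {root}")
    wanted = {".sml", ".yaml", ".yml"}
    # Walk the directory tree and keep only files with the assumed extensions.
    return sorted(p for p in root.rglob("*") if p.is_file() and p.suffix in wanted)
```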
[More about rules →](rules.md)

## Test Data

Generate sample JSON actions:

```bash
docker compose --profile test_data up osprey-kafka-test-data-producer -d
```

This produces user login events with timestamps, user IDs, and IP addresses to the `osprey.actions_input` topic.
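
The general shape of such an event can be sketched as follows. The field names (`action_name`, `timestamp`, `user_id`, `ip_address`) are illustrative assumptions and may not match the schema the real producer emits; `make_login_event` is a hypothetical helper, and `192.0.2.1` is a documentation-only address.

```python
import json
from datetime import datetime, timezone


def make_login_event(user_id, ip_address, now=None):
    """Build one JSON-encoded login event (hypothetical field names)."""
    now = now or datetime.now(timezone.utc)
    event = {
        "action_name": "user_login",  # assumed name, not confirmed by the source
        "timestamp": now.isoformat(),
        "user_id": user_id,
        "ip_address": ip_address,
    }
    return json.dumps(event)


print(make_login_event("user-123", "192.0.2.1"))
```

Check the producer's source or consume a few messages from the topic to see the actual payload before writing rules against it.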