# Cloudlab

> [!IMPORTANT]
> This project is designed to manage my offsite setup, which is specific to my
> use cases, so it might not be directly useful to you. For a ready-to-use
> solution, please refer to my [homelab project](https://github.com/khuedoan/homelab).

## Project structure

```
.
├── flake.nix               # Contains dependencies required by this project for both local and CI/CD
├── Makefile                # Entry point for all manual actions
├── compose.yaml            # Servers required for running locally
├── infra                   # Infrastructure definition
│   ├── modules             # Terraform modules
│   │   ├── network
│   │   ├── instance
│   │   ├── cluster
│   │   └── ...
│   ├── local               # Terragrunt configuration for the local environment
│   │   └── ...
│   └── ${ENV}              # Terragrunt configuration for the ${ENV} environment
│       ├── root.hcl        # Root config used by other Terragrunt files
│       ├── secrets.yaml    # Encrypted secrets
│       ├── tfstate         # Bootstrap Terraform state
│       ├── ${CLOUD}
│       │   └── ${REGION}
│       │       └── ${MODULE}
│       │           └── terragrunt.hcl
│       ├── metal
│       │   └── vn-southeast-1
│       │       ├── bootstrap
│       │       │   └── terragrunt.hcl
│       │       └── cluster
│       │           └── terragrunt.hcl
│       └── ...
├── platform                # Highly privileged platform components
│   └── ${ENV}
│       ├── grafana.yaml
│       ├── temporal.yaml
│       ├── wireguard.yaml
│       └── ...
├── apps                    # User applications, standardized with strict controls
│   ├── ${NAMESPACE}
│   │   └── ${APP}
│   │       └── ${ENV}.yaml
│   └── khuedoan
│       └── blog
│           ├── local.yaml
│           └── production.yaml
├── controller              # Automation controller for the entire project - think GitHub Actions, but better
│   ├── activities          # Temporal activities (git clone, terragrunt apply, etc.)
│   │   ├── git.go
│   │   ├── terragrunt.go
│   │   └── ...
│   ├── workflows           # Temporal workflows, define a sequence of activities
│   │   ├── infra.go
│   │   ├── app.go
│   │   └── ...
│   ├── worker              # Worker process that executes the workflows
│   └── Dockerfile          # Builds the image for the controller, can run locally or on a cluster
└── test                    # High level tests
```

## Features

- Unified hybrid cloud platform
- Temporal is used as the automation engine, providing the reliability and
  performance that generic CI/CD engines can only dream of.
- Infra aka IaaS:
    - Essentially `cd "infra/${ENV}" && terragrunt apply --all`
    - Includes some graph pruning based on changed files for performance
    - Bootstraps ArgoCD to apply the rest
- Platform aka PaaS:
    - Essentially `kubectl apply -f "platform/${ENV}"`
    - However, the runtime doesn't have access to Git - all manifests are pulled from an OCI registry
- Apps aka SaaS:
    - Strict and standardized
    - Generated from `apps/${NAMESPACE}/${APP}/${ENV}.yaml`
    - Published to the cluster as a Flux OCI artifact

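The pruning logic lives in the controller and is not shown in this README. As a rough illustration only, the sketch below assumes that Terragrunt units sit at `infra/${ENV}/${CLOUD}/${REGION}/${MODULE}` (as in the project structure) and maps a list of changed paths to the unit directories they touch. `changed_units` is a hypothetical helper, not part of this project.

```sh
# Hypothetical helper: read changed file paths (one per line) on stdin and
# print the unique Terragrunt unit directories under infra/<env> they belong to.
# Assumes the layout infra/<env>/<cloud>/<region>/<module>/...
changed_units() {
    env="$1"
    # Keep only paths inside a unit directory, cut them down to that
    # directory, then de-duplicate.
    sed -n "s|^\(infra/${env}/[^/]*/[^/]*/[^/]*\)/.*|\1|p" | sort -u
}

# Example usage (not run here): apply only the units touched by the last commit.
# git diff --name-only HEAD~1 | changed_units production | while read -r unit; do
#     (cd "$unit" && terragrunt apply)
# done
```

Shared files such as `root.hcl` or anything under `infra/modules` match no unit in this sketch, so a real implementation would need to fall back to a full `terragrunt apply --all` when they change.
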
## Estimated cost

| Provider | Service | Usage | Pricing |
| :-- | :-- | :-- | :-- |
| Metal | Hardware depreciation | | 76.32$/year |
| Metal | Electricity | | 36$/year |
| Oracle Cloud | Virtual Cloud Network | 1 | Free |
| Oracle Cloud | `VM.Standard.A1.Flex` (ARM) - 4 cores, 24GB RAM, 200GB disk | 1 | Free |
| Hetzner | VM `CAX21` - 4 cores, 8GB RAM, 80GB disk | 1 | 83.88$/year |
| Cloudflare | R2 Bucket (Terraform state) | 2 | Free |
| Cloudflare | Domain | 2 | 20$/year |
| Cloudflare | Load Balancer | 1 | 60$/year |
| Cloudflare | Tunnel | 2 | Free |
| Backblaze | B2 Bucket (backup) | 1TB | 72$/year |
| **Total** | | | 348.2$/year |

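The total can be re-derived from the paid rows. A small sketch using `awk`, with the amounts copied from the table above (free-tier rows are left out); `total_cost` is just an illustrative name:

```sh
# Sum the paid line items from the cost table (USD per year).
total_cost() {
    printf '%s\n' 76.32 36 83.88 20 60 72 |
        awk '{ sum += $1 } END { printf "%.2f\n", sum }'
}

total_cost   # prints 348.20
```
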
## Get started

### Prerequisites

- Fork this repository because you will need to customize it for your needs.
- A credit/debit card to register for the accounts.
- Basic knowledge of Terraform, Ansible and Kubernetes (optional, but will help a lot)

Configuration files:

<details>

<summary>Terraform Cloud</summary>

- Create a Terraform Cloud account at <https://app.terraform.io>

</details>

<details>

<summary>Oracle Cloud</summary>

- Create an Oracle Cloud account at <https://cloud.oracle.com>
- Generate an API signing key:
    - Profile menu (User menu icon) -> User Settings -> API Keys -> Add API Key
    - Select Generate API Key Pair, download the private key to `~/.oci/private.pem` and click Add
    - Copy the Configuration File Preview to `~/.oci/config` and change `key_file` to `~/.oci/private.pem`

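The last step above can also be scripted. This is a minimal sketch, assuming the pasted preview contains a single `key_file=...` line; `set_oci_key_file` is a hypothetical helper, and both paths are parameters so it can be tried on a copy of the config first. It also restricts the file permissions to the current user, which is a precaution added here rather than a step from the list.

```sh
# Hypothetical helper: point key_file in an OCI config file at the private key.
set_oci_key_file() {
    config="$1"
    key="$2"
    # Rewrite the key_file line through a temp file (portable across sed variants).
    tmp="$(mktemp)"
    sed "s|^key_file=.*|key_file=${key}|" "$config" > "$tmp" && mv "$tmp" "$config"
    # Restrict the config (and the key, if it exists at that path) to the current user.
    chmod 600 "$config"
    [ -f "$key" ] && chmod 600 "$key"
    return 0
}

# Example usage (not run here):
# set_oci_key_file "$HOME/.oci/config" "$HOME/.oci/private.pem"
```
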
If you see a warning like this, try to avoid those regions:

> ⚠️ Because of high demand for Arm Ampere A1 Compute capacity in the Foo and Bar regions, A1 instance availability in these regions is limited.
> If you plan to create A1 instances, we recommend choosing another region as your home region

</details>

Install the following packages:

- [Nix](https://nixos.org/download.html)

That's it! Run the following command to open the Nix shell:

```sh
nix develop
```

### Provision

Build the infrastructure:

```sh
make
```

Bootstrap secrets:

```yaml
# https://dash.cloudflare.com -> Storage & databases -> R2 -> Overview -> API Tokens -> Manage -> Create Account API Token
cloudflare_tfstate_api_token: foo
cloudflare_tfstate_access_key: foo
cloudflare_tfstate_secret_key: foo
# https://dash.cloudflare.com -> Click on account name -> Copy account ID
cloudflare_account_id: foo

# https://console.hetzner.com -> Create new project -> Security -> API tokens -> Generate API token -> Terraform (Read & Write)
hetzner_token: foo

# Proxmox login credentials
proxmox_username: foo
proxmox_password: foo
```

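Before the file is encrypted, it may help to confirm that no `foo` placeholder is left. A minimal sketch, assuming the values above are saved in a plaintext YAML file with one `key: value` pair per line; `placeholder_keys` and the file name in the usage comment are hypothetical:

```sh
# Hypothetical pre-flight check: print every key in a plaintext secrets file
# that still holds the placeholder value "foo". Prints nothing when all are set.
placeholder_keys() {
    # Match "key: foo" lines (comments never match) and print only the key.
    sed -n 's/^\([A-Za-z0-9_]*\): *foo *$/\1/p' "$1"
}

# Example usage (not run here):
# placeholder_keys secrets.plain.yaml
```
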
### Operations

- [Rebuild environment](docs/how-to-guides/rebuild-environment.md)

## TODOs

- Fix OCI plain HTTP for local development
- Configure Git username and email
- Credentials for the worker (SSH private key + public key + `known_hosts`?)
- Contract between clouds:
    - Compute, x86_64 or aarch64
    - Public IPv6
    - Allow ports:
        - 443
        - 80 (for HTTP-01)
        - 22
        - 51820 and 51821 (for WireGuard IPv4 and IPv6)
    - NixOS, with SSH access
- Firewall rules (currently manually managed in routers):
    - TCP: 6443 (Kube API), 443 (HTTPS), 80 (HTTP), 10250 (Kubelet metrics), 22 (SSH)
    - UDP: 51820 (WireGuard IPv4), 51821 (WireGuard IPv6)

## Acknowledgments and References

- [Oracle Terraform Modules](https://github.com/oracle-terraform-modules)
- [Official k3s systemd service file](https://github.com/k3s-io/k3s/blob/master/k3s.service)
- [Sample Prometheus configuration for Istio](https://github.com/istio/istio/blob/master/samples/addons/extras/prometheus-operator.yaml)
- [Terraform and nixos-anywhere infrastructure for wiki.nixos.org](https://github.com/NixOS/nixos-wiki-infra)