The code and data behind xeiaso.net

blog: Why Rust, A Tale of Satori (#118)

authored by Christine Dodrill and committed by GitHub
8be19405 955e63fa

+233
blog/why-rust-2020-02-15.markdown
---
title: Why Rust
date: 2020-02-15
tags:
  - rust
  - rant
  - satori
  - golang
---

# Why Rust

Or: A Trip Report from my Satori with Rust and Functional Programming

Software is a very odd field to work in. It is simultaneously an abstract and
physical one. You build systems that can deal with an unfathomable amount of
input and output at the same time. As a job, I peer into the madness of an
unthinking automaton and give order to the inherent chaos. I then emit
incantations to describe what this unthinking automaton should do in my stead. I
cannot possibly track the relations between a hundred thousand transactions
going on in real time, much less file them appropriately so they can be summoned
back should the need arise.

However, this incantation (by necessity) is an _unthinkably_ precise and fickle
beast. It's almost as if you are training a four-year-old to go to the store,
but doing it by having them read a grocery list. This grocery list has to be
precise enough that the four-year-old ends up getting what you want and not a
cart full of frosted flakes and candy bars. But, at the same time, the
four-year-old needs to understand it. Thus, the precision.

There are many schools of thought around ways to write the grocery list. Some
follow a radically simple approach, relying on the toddler to figure things out
at the store. Sometimes this simpler approach doesn't work out in more obscure
scenarios, like when they are out of red grapes but do have green grapes, but it
tends to work out often enough. Proponents of these list-making tools will also
advocate for doing full tests of the grocery list before they send the toddler
off to the store. This means setting up a fake grocery store with funny money, a
fake card, plastic food, the whole nine yards.
This can get expensive and can
become a logistical issue (where are you going to store all that plastic fruit
in a way that you can just set up and tear down the grocery store mock so
quickly?).

Another school of thought is that the process of writing the grocery list should
be done in a way that prevents ambiguity at the grocery store. This kind of flow
uses some more advanced concepts like the ability to describe something by its
attributes. For example, this could specify the difference between fruit and
vegetables, and only allow fruit to be put in one place of the cart and only
allow vegetables to be placed in the other. And if the writer of the list tries
to violate this, the list gets rejected and isn't used at all.

There is yet another school of thought that decides that the exact spatial
position of the toddler relative to everything else should be thought of in
advance, along with a process to make sure that nothing is done in an improper
way. This means writing the list can be a lot harder at first, but it's much
less likely to result in the toddler coming back with a weird state. Consider
what happens if two items show up at the same time and the toddler tries to grab
both of them at the same time due to the instructions in the list! They only
have one arm to grab things with, so it just doesn't work. Proponents of the
more strict methods have reference cells and other mechanisms to ensure that the
toddler can only ever grab one thing at a time.

If we were to match these three ludicrous examples to programming languages, the
first would be Lua, the second would be Go and the third would be something like
Haskell or Rust. Software development is a complicated process because the
problems involved with directing that unthinking automaton to do what you want
are hard.
There is a lot going on, much in the same way there is a lot going on
when you send a toddler to do your grocery shopping for you.

A good way to look at the tradeoffs involved is to see things as a balance
between two forces, pragmatism and correctness. Languages that are more
pragmatic are easier to develop in, but are mathematically more likely to run
into problems at runtime. Languages that are more correct take more investment
to write up front, but over time the correctness means that there are fewer
failed assumptions about what is going on. The compiler stops you from doing
things that don't make sense to it. This means that it ranges from difficult to
literally impossible to create a bad state at runtime.

Tools like Lua and Go can be (and have been) used to develop stable and viable
software. [itch.io][itchio] is written in Lua running on top of nginx and it
handles financial transactions well enough that it's turned into the guy's
full-time job. Google uses Go everywhere in their stack, and it's been used to
create powerful tools like Kubernetes, Caddy, and Docker. These tools are
trusted implicitly by a generation of developers, even though the language
itself has its flaws. If you are reading this blog in Firefox, statistically
there is Rust involved in the rendering and viewing of this post. Rust is built
for ensuring that code is _as correct as possible_, even if it means eating into
development time to ensure that.

[itchio]: https://itch.io

In Rust, you don't have to memorize rules about how and when it is safe to
update data in structures, because the compiler ensures you cannot mess it up:
_it rejects the code if you could be messing it up_.
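
As a minimal sketch of what that looks like in practice, here is one way
several threads can update a single hashmap using only the standard library.
The function name `tally`, the key names, and the worker count are all made up
for illustration; this is not code from any project mentioned in this post.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: spawn `workers` threads that all write to one map.
fn tally(workers: usize) -> HashMap<String, usize> {
    // The map lives behind a Mutex inside an Arc. Handing the threads a plain
    // `&mut HashMap` instead would be rejected at compile time.
    let shared = Arc::new(Mutex::new(HashMap::new()));

    let handles: Vec<_> = (0..workers)
        .map(|id| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // lock() grants exclusive access until `map` goes out of scope.
                let mut map = shared.lock().unwrap();
                *map.entry("writes".to_string()).or_insert(0) += 1;
                map.insert(format!("worker-{}", id), id);
            })
        })
        .collect();

    // Wait for every writer to finish before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let result = shared.lock().unwrap().clone();
    result
}

fn main() {
    let counts = tally(4);
    println!("{} total writes", counts["writes"]);
}
```

The design choice here is that the locking is part of the type: there is no
way to reach the map without going through `lock()`, so forgetting to
synchronize is a compile error rather than something a test has to catch.
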
You don't have to run your
tests with a race detector or figure out how to expose that in production to
trace down that obscure double-write to a non-threadsafe hashmap, because safe
Rust will not compile code that writes to a hashmap from two threads without
synchronization. In effect, there is only a safe hashmap and there can only ever
be a safe hashmap.

As an absurd example, consider the following two snippets of code, one in Go and
one in Rust. Both of them put integers into a standard library list and then
print them all out:

```go
// assumes imports of "container/list" and "log"
l := list.New() // () -> *list.List
for i := 0; i < 5; i++ {
	l.PushBack(i) // interface{} -> *list.Element
}

for e := l.Front(); e != nil; e = e.Next() {
	log.Printf("%T: %v", e.Value, e.Value)
}
```

```rust
let mut vec = Vec::<i64>::new(); // () -> Vec<i64>

for i in 0..5 {
    vec.push(i as i64); // (&mut Vec<i64>, i64) -> ()
}

for i in vec.iter() {
    println!("{}", i);
}
```

The Go version uses `interface{}` as the data element because Go [literally
cannot describe types as parameters to functions][gonerics]. The Rust version
took me a bit longer to write, but there is _no_ ambiguity as to what the vector
holds. The Go version can also hold multiple types of data in the same list,
à la:

[gonerics]: https://golang.org/doc/faq#generics

```go
l := list.New()
l.PushBack(42)
l.PushBack("hotdogs")
l.PushBack(420.69)
```

All of which is valid because in Go, an `interface{}` matches _every kind of
value possible_. An integer is an `interface{}`. A floating-point number is an
`interface{}`. A string is an `interface{}`. A bool is an `interface{}`. Any
custom type you create is an `interface{}`. Normally, this would be very
restrictive and make it difficult to do things like JSON parsing.
However, the Go
runtime lets you hack around this with [reflection][wtfisreflection].

[wtfisreflection]: https://golangbot.com/reflection/

This allows the standard library to handle things like JSON parsing with
functions [that look like this](https://godoc.org/encoding/json#Unmarshal):

```go
func Unmarshal(data []byte, v interface{}) error
```

There's even a set of complicated rules (struct tags) you need to memorize about
how to trick the JSON parser into massaging your data into place. This lets you
do things like this:

```go
type Rilkef struct {
	Foo        string `json:"foo"`
	CallToArms string `json:"call_to_arms"`
}
```

This allows the programmer a lot of flexibility while developing and compiling
the code. It's very easy for the compiler to say "oh, hey, that could be
anything, and you gave it some kind of anything, sounds legit to me", but then
the job of ensuring the sanity of the inputs is shunted to _runtime_ rather than
stopped before the code gets deployed. This means you need to test the code in
order to see how it behaves, making sure that _the standard library is doing its
job correctly_. This kind of stuff does not happen in Rust.

The Rust version of this JSON example uses the [serde][serde] and
[serde_json][serdejson] libraries:

[serde]: https://serde.rs
[serdejson]: https://serde.rs/json.html

```rust
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
pub struct Rilkef {
    pub foo: String,
    pub call_to_arms: String,
}
```

And the logic for handling the correct rules for serialization and
deserialization is generated and checked at _compile time_.
Serde also
allows you to support more than just JSON, so this same type can be reused for
Dhall, YAML or whatever you could imagine.

## tl;dr

Rust allows for more correctness at the cost of developer efficiency. This is a
tradeoff, but I think it may actually be worth it. Code that is more correct is
more robust and less prone to failure than code that is less correct. This leads
to software that is less likely to crash at 3 am and wake you up due to a
preventable developer error.

After working in Go for more than half a decade, I'm starting to think that it
is probably a better idea to trade away some developer velocity and force
developers to write software that is more correct. Go works if you are careful
about how you handle it. However, it amounts to a giant list of rules that you
just have to know (like maps not being threadsafe), and a lot of those rules
come from battle rather than from the development process.

This came out as more of a rant than I had thought it would, but overall I hope
my point isn't lost.

### Things You Might Complain About

Yes, I know slices exist in Go. I wanted to prove a point about how the overuse
of `interface{}` in some relatively core things (like generic lists) can cause
headaches in terms of correctness. Go will reject an attempt to append a string
to an integer slice, but you cannot create your own type that functions
identically to an integer slice.

Go does have a race detector that will point out a lot of sins in concurrent
programs, but that is again at _runtime_, not at _compile time_.

---

Many thanks to Tene, Sr. Oracle, A. Wilfox, Byte-slice, SiIvagunner and anyone
who watched the stream where I wrote this blogpost. If I got things wrong in
this, please [reach out to me](/contact) to let me know what I messed up.
This
is a composite of a few twitter threads and a conversation I had on IRC.

Thanks for reading, be well.