The code and data behind xeiaso.net

the nguh compiler

Signed-off-by: Xe Iaso <me@christine.website>

Xe Iaso 3890085b 6171b772

+277 -1
+1 -1
blog/formal-grammar-of-h-2019-05-19.markdown
···
  ---
  title: A Formal Grammar of h
  date: 2019-05-19
- series: conlangs
+ series: h
  ---

  ## Introduction
+1
blog/h-language-2019-06-30.markdown
···
  ---
  title: The h Programming Language
  date: 2019-06-30
+ series: h
  tags:
   - wasm
   - release
+269
blog/hlang-nguh.markdown
···
+ ---
+ title: "The Next-Generation Universal Hlang compiler"
+ date: 2022-12-31
+ series: h
+ tags:
+  - hlang
+  - wasm
+ vod:
+   twitch: https://www.twitch.tv/videos/1693936831
+   youtube: https://youtu.be/QY1O2n4tOhE
+ ---
+
+ In a world where simple tasks have hundreds of dependencies and most of them are
+ not documented, everything falls to chaos. The monolithigarchy dictates that
+ your build times must be slow so that They (the dependocracy) can win over your
+ hearts and minds with video games that you play during your compile times. One
+ person gets mad about their string padding library being used by corporations
+ without paying and then the entire internet explodes for a few days. This is
+ unsustainable.
+
+ hlang is the sledgehammer that will break down this complexity and deliver you a
+ truly uncompromised development experience.
+
+ <xeblog-conv name="Numa" mood="delet">You can't spell _sledgehammer_ without
+ _h_!</xeblog-conv>
+
+ If none of this is making any sense, please read [the rest of the
+ series](https://xeiaso.net/blog/series/h). This will hopefully help something
+ make sense.
+
+ <xeblog-conv name="Numa" mood="delet">If you need even more context, check [this
+ page](https://pkg.go.dev/context) for more information.</xeblog-conv>
+
+ There was one major flaw with hlang in the past though. It was a hollow shell of
+ itself and had rotted under the slings and arrows of time. The playground
+ stopped working, so people could not understand the sheer might of hlang by
+ playing with it.
+
+ Lo, behold, a new compiler was born. In this article, I will describe the nguh
+ compiler and how it revolutionizes the ways that you use hlang for both
+ professional and personal uses.
+
+ <xeblog-conv name="Mara" mood="wat">Wait, what, there _were_ professional users
+ of hlang???</xeblog-conv>
+
+ <xeblog-conv name="Numa" mood="delet">Having 2 years of hlang on your resume
+ will let you get hired by Google!</xeblog-conv>
+
+ ## The Old Compiler
+
+ The old compiler was a HACK. The main way it worked was by feeding the program
+ source code as a string to this [Go template](https://pkg.go.dev/text/template):
+
+ ```
+ (module
+  (import "h" "h" (func $h (param i32)))
+  (func $h_main
+        (local i32 i32 i32)
+        (local.set 0 (i32.const 10))
+        (local.set 1 (i32.const 104))
+        (local.set 2 (i32.const 39))
+        {{ range . -}}
+        {{ if eq . 32 -}}
+        (call $h (get_local 0))
+        {{ end -}}
+        {{ if eq . 104 -}}
+        (call $h (get_local 1))
+        {{ end -}}
+        {{ if eq . 39 -}}
+        (call $h (get_local 2))
+        {{ end -}}
+        {{ end -}}
+        (call $h (get_local 0))
+  )
+  (export "h" (func $h_main))
+ )
+ ```
+
+ This template worked by taking the program input _as a string_ and looping over
+ each character to decide what to do. If it was a space, it would print a
+ newline. If it was an `h`, it would print `h`. If it was a `'`, it would print a
+ `'`. Anything else was ignored.
+
+ However, this meant that the parser was mostly ignored. And the parser spec
+ compiles to 117 bytes when gzipped, which means that it can fit on a t-shirt.
+
+ <xeblog-conv name="Numa" mood="delet">That's a savings of 0.8475%!</xeblog-conv>
+
+ Additionally, this would then use the command
+ [`wat2wasm`](https://developer.mozilla.org/en-US/docs/WebAssembly/Text_format_to_wasm)
+ to compile it to a WebAssembly file instead of doing it directly. This combined
+ with the fact that the `get_local` instruction was renamed to `local.get` in the
+ text format some time in the last 2 years means that not only was my compiler
+ hacky, it didn't work anymore.
+
+ <xeblog-conv name="Mara" mood="hacker">Apparently that was renamed before WASM
+ hit 1.0 and the legacy name was an alias they planned to remove. Guess who
+ didn't get the memo!</xeblog-conv>
+
+ Needless to say, this could be fixed by doing a simple
+ `s/get_local/local\.get/g` on the source file, but that's not fun. You know
+ what's really fun? Reverse-engineering a binary file on stream and reassembling
+ an identical replica in code. That's fun.
+
+ ## The nguh compiler
+
+ On December 31st, 2022, I wrote the nguh compiler [on
+ stream](https://twitch.tv/princessxen). The nguh (nguh gives u hlang or
+ Next-Generation Universal Hlang compiler, whichever you prefer) compiler outputs
+ WebAssembly bytecode directly instead of using `wat2wasm` as a middleman.
+
+ <xeblog-conv name="Mara" mood="happy">This means that hlang has even fewer
+ dependencies!</xeblog-conv>
+
+ nguh is supposed to be pronounced with the final sound of `-ing` and `uh`
+ smashed together. It is not phonetically valid in English. It will take some
+ practice to say it correctly. I'm not sorry. If you can read IPA, it's
+ pronounced /ŋə/. The name comes from the YouTuber [Agma
+ Schwa](https://www.youtube.com/@AgmaSchwa)'s show about conlangs named /ŋə/.
+
+ To help you understand the architecture of nguh, it will be helpful to get some
+ context about how WebAssembly files work.
+
+ ## How WebAssembly files work
+
+ <details>
+ <summary>What is WebAssembly?</summary>
+
+ WebAssembly is a standard that specifies a way to run programs on arbitrary
+ hardware in a sandboxed way. It is used mainly in web browsers to power things
+ like YouTube's player component, Twitch stream viewing, and by developers any
+ time they need to put a block of code into a website without having to rewrite
+ it in JavaScript.
+
+ I'm part of a slowly growing group of developers that want to run WebAssembly
+ code on the server so that you can take the same `.wasm` file and run it on any
+ hardware without having to have the source code and a working compiler setup.
+
+ hlang is compiled to WebAssembly for no reason in particular.
+ </details>
+
+ At a high level, a WebAssembly module has a bunch of sections in it. Each
+ section contains information for things like what functions the module exports,
+ the types of imported functions, how much memory the module needs, what should be
+ in memory by default, and the function bodies for your code. Here's an annotated
+ disassembly of a hlang binary:
+
+ ```
+ 0x00, 0x61, 0x73, 0x6d, // \0asm wasm magic number
+ 0x01, 0x00, 0x00, 0x00, // version 1
+
+ 0x01,                   // type section
+ 0x08,                   // 8 bytes long
+ 0x02,                   // 2 entries
+ 0x60, 0x01, 0x7f, 0x00, // function type 0, 1 i32 param, 0 return
+ 0x60, 0x00, 0x00,       // function type 1, 0 param, 0 return
+
+ 0x02,                   // import section
+ 0x07,                   // 7 bytes long
+ 0x01,                   // 1 entry
+ 0x01, 0x68,             // module h
+ 0x01, 0x68,             // name h
+ 0x00,                   // import kind: function
+ 0x00,                   // type index 0
+
+ 0x03,                   // func section
+ 0x02,                   // 2 bytes long
+ 0x01,                   // 1 entry
+ 0x01,                   // type index 1
+
+ 0x07,                   // export section
+ 0x05,                   // 5 bytes long
+ 0x01,                   // 1 entry
+ 0x01, 0x68,             // "h"
+ 0x00, 0x01,             // function 1
+
+ 0x0a,                   // code section
+ 0x1b,                   // 27 bytes long
+ 0x01,                   // 1 entry
+ 0x19,                   // 25 bytes long
+ 0x01,                   // 1 local declaration
+ 0x03, 0x7f,             // 3 i32 values - (local i32 i32 i32)
+ 0x41, 0x0a,             // i32.const 10 (newline)
+ 0x21, 0x00,             // local.set 0
+ 0x41, 0xe8, 0x00,       // i32.const 104 (h)
+ 0x21, 0x01,             // local.set 1
+ 0x41, 0x27,             // i32.const 39 (')
+ 0x21, 0x02,             // local.set 2
+ 0x20, 0x01,             // local.get 1 push h
+ 0x10, 0x00,             // call 0 (putchar)
+ 0x20, 0x00,             // local.get 0 push newline
+ 0x10, 0x00,             // call 0 (putchar)
+ 0x0b                    // end of function
+ ```
+
+ At a high level, nguh just takes all the needed sections and [puts them in the
+ target
+ binary](https://github.com/Xe/x/blob/2fe527950512b97a544d2d59539026514ad59544/cmd/hlang/nguh/compile.go#L53).
+ Most of the sections are copied verbatim from that disassembly I pasted above
+ because they don't need any modification for the binary to work.
+
+ The exciting part happens when the individual nodes in the hlang syntax tree get
+ compiled to WebAssembly bytecode. Each node in the tree may have a character to
+ print and may have a list of child nodes. A syntax tree for hlang could look
+ like this if it has one character in the program:
+
+ ```
+ input: h
+ H("h")
+ ```
+
+ Or it could look like this if there are multiple characters in the program:
+
+ ```
+ input: h h h
+ H{
+   "h",
+   "h",
+   "h",
+ }
+ ```
+
+ This means I need something like this:
+
+ ```go
+ // compile AST to wasm
+ if len(tree.Kids) == 0 {
+ 	if err := compileOneNode(funcBuf, tree); err != nil {
+ 		return nil, err
+ 	}
+ } else {
+ 	for _, node := range tree.Kids {
+ 		if err := compileOneNode(funcBuf, node); err != nil {
+ 			return nil, err
+ 		}
+ 	}
+ }
+ ```
+
+ This will either read from the root of the tree or all of the tree's children in
+ order to compile the entire program. The `compileOneNode` function will turn the
+ text associated with the node into the corresponding WASM bytecode (pushing the
+ relevant character to the stack and then calling the `h.h` (`putchar`) function).
+
+ Finally, it will generate the end of the function including a trailing newline
+ and end the `.wasm` file.
+
+ <xeblog-conv name="Mara" mood="hacker">Fun fact: the generated binary for a
+ hlang program that only prints `h` is 69 bytes.</xeblog-conv>
+
+ <xeblog-conv name="Numa" mood="delet">NICE!</xeblog-conv>
+
+ Here is a base64-encoded hlang binary in case you find this interesting:
+
+ ```
+ AGFzbQEAAAABCAJgAX8AYAAAAgcBAWgBaAAAAwIB
+ AQcFAQFoAAEKHQEbAQN/QQohAEHoACEBQSchAiAB
+ EAAgABAAAQEL
+ ```
+
+ ---
+
+ If you want to play with hlang, head to its new home at
+ [h.within.lgbt](https://h.within.lgbt). If you want to witness things such as
+ this being created live, follow me [on Twitch](https://twitch.tv/princessxen) or
+ on my VTuber business account at [@xe@vt.social](https://vt.social/@xe).
+
+ <xeblog-conv name="Cadey" mood="enby">Happy new year to those that
+ celebrate!</xeblog-conv>
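
For anyone who wants to poke at the base64-encoded binary from the post, here is a minimal sketch that decodes it and checks the module header. The helper name `decodeModule` and the constant `encodedModule` are made up for this example; only the standard library is used, and nothing here comes from the nguh source itself.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/binary"
	"fmt"
)

// encodedModule is the base64 text from the post with the line breaks removed.
const encodedModule = "AGFzbQEAAAABCAJgAX8AYAAAAgcBAWgBaAAAAwIB" +
	"AQcFAQFoAAEKHQEbAQN/QQohAEHoACEBQSchAiAB" +
	"EAAgABAAAQEL"

// decodeModule decodes the base64 text and checks the "\0asm" magic number.
// It returns the raw bytes and the little-endian version field.
func decodeModule(s string) ([]byte, uint32, error) {
	raw, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return nil, 0, err
	}
	if len(raw) < 8 || !bytes.Equal(raw[:4], []byte{0x00, 'a', 's', 'm'}) {
		return nil, 0, fmt.Errorf("missing WebAssembly magic number")
	}
	return raw, binary.LittleEndian.Uint32(raw[4:8]), nil
}

func main() {
	raw, version, err := decodeModule(encodedModule)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes, version %d\n", len(raw), version) // prints "69 bytes, version 1"
}
```

The decoded length agrees with the 69 bytes Mara mentions.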
+1
blog/the-origin-of-h-2015-12-14.markdown
···
  ---
  title: The Origin of h
  date: 2015-12-14
+ series: h
  ---

  NOTE: There is a [second part](https://xeiaso.net/blog/formal-grammar-of-h-2019-05-19) to this article now with a formal grammar.
+5
dhall/package.dhall
···
    , title = "Aura"
    , description = "PonyvilleFM live DJ recording bot"
    }
+ , Link::{
+   , url = "https://h.within.lgbt"
+   , title = "The h Programming Language"
+   , description = "An esoteric programming language that compiles to WebAssembly"
+   }
  , Link::{
    , url = "https://github.com/Xe/olin"
    , title = "Olin"