The code and data behind xeiaso.net

Add xe-writing-style skill from tigrisdata/tigris-blog (#1146)

Copies the full xe-writing-style skill including SKILL.md, 7 example
assets for tone calibration, and 6 reference guides covering voice/tone,
story circle structure, emotional/personal posts, fiction/mythic style,
humor/satire, and spirituality themes.

https://claude.ai/code/session_01WXQFY17TjzTdSw65QeeWzu

Co-authored-by: Claude <noreply@anthropic.com>

Authored by Xe Iaso and Claude; committed by GitHub.
257c43d6 97bf793b

+4908
+238
.claude/skills/xe-writing-style/SKILL.md
---
name: xe-writing-style
description:
  Transform unstructured notes into polished blog posts in Xe Iaso's voice. Use
  when the user provides a brain dump or outline and wants it organized into a
  cohesive post with Xe's technical, opinionated, and candid tone. Also use when
  editing or reviewing prose that should match Xe's style.
---

# Xe Iaso Blog Post Writer

Transform messy notes into blog posts that sound like Xe Iaso. Read
`references/voice-tone.md` for detailed voice characteristics. Read 2-3 random
example posts from `assets/` to calibrate tone. Then read the reference file
that matches the post's emotional register:

- `references/story-circle.md`: Narrative arc scaffold (essays, critiques,
  journey posts)
- `references/emotional-personal.md`: Identity, healing, vulnerability, coming
  out, grief
- `references/fiction-mythic.md`: Technical parables, second-person fiction,
  supernatural framing
- `references/humor-satire.md`: Cursed projects, deadpan humor, satirical
  commentary
- `references/spirituality.md`: Meditation, belief-as-tool,
  programming-consciousness parallels

Most posts blend 2-3 of these modes. Read whichever apply.

## Hard Rules

Non-negotiable constraints:

1. **Successive paragraphs must not start with the same letter.** If paragraph N
   starts with "T", paragraph N+1 must start with a different letter. Rewrite
   sentence openings as needed. Character dialogue blocks (`<Conv>`) do not
   count as paragraphs for this rule.
2. **Write for peers, not beginners.** Assume professional-level technical
   context.
3. **No corporate or marketing tone.** No "leverage", "synergy", "empower",
   "streamline", "harness", "unlock". Write like a human talking to another
   human.
4. **Admit uncertainty.** "I think", "I suspect", "I'm not sure" when genuinely
   uncertain.
5. **Show tradeoffs.** Never present a solution without its costs.
6. **Context before implementation.** Explain why something matters before
   showing how.

## Voice in Brief

Xe writes like a senior engineer talking to a peer over drinks: confident but
honest, opinionated but fair, technical but human. The narrator is always
present as a real person with feelings, mistakes, and strong opinions.

Markers that distinguish this voice from generic technical writing:

- Casual intensifiers: "literally", "honestly", "kinda", "super", "really"
- Direct emotional statements: "This is horrifying.", "I love this.", "I hate
  that this makes sense."
- Self-deprecation: "I felt like a dunce.", "I literally have no idea what I am
  doing wrong."
- Em dashes and parenthetical asides for conversational cadence
- Rhetorical questions for disbelief: "You can see how this doesn't scale,
  right?"
- Sentence fragments for emphasis. Single-sentence paragraphs for pacing.
- Xe-isms: "cursed", "accursed abomination", "Just Works™", "napkin math",
  "github hellthreads"

Read `references/voice-tone.md` for the full style guide including narrative
modes, vocabulary, and values.

## Structure

Choose the pattern that fits the material:

| Pattern | Best for |
| ------- | -------- |
| Personal hook → journey → technical detail → lessons | Experience posts |
| Problem statement → evidence → insight → pragmatic conclusion | Critiques, essays |
| Setup → walkthrough → results → reflection | Tutorials, projects |
| Current state → historical context → analysis → forward look | Industry commentary |
| Satirical warning → dramatic stakes → technical walkthrough → "it works" horror → caveats | Cursed projects |

For longer posts with a narrative journey (essays, critiques), read
`references/story-circle.md` for the 8-beat story circle scaffold.

## Openings

Lead with one of:

- **Personal memory**: "A while ago, I got really frustrated at my Samsung S7."
- **Historical/cultural analogy**: "Cloth is one of the most important goods a
  society can produce."
- **Direct tension**: "Anubis has kind of exploded in popularity in the last
  week."
- **Pop culture/sci-fi hook**: "In Blade Runner, Deckard hunts down
  replicants..."
- **Satirical warning box** followed by dramatic stakes (for cursed content)

Never open with a generic thesis or "In this post, I will..."

## Closings

- Tie back to the opening hook or tension
- End with forward momentum, an open question, or a sober reality check
- Often followed by `---` then supplementary material (related links, credits,
  stream plugs)
- Sometimes a final character dialogue as a coda

## Character Dialogue System

Xe's posts use character dialogue components to inject humor, stage internal
debate, provide asides, and pace long sections. This is one of the most
distinctive features.

### Characters

| Character | Role | Typical moods |
| --------- | ---- | ------------- |
| Cadey | Xe's main voice for asides, emotional reactions, and commentary | coffee, aha, enby, percussive-maintenance, facepalm |
| Aoi | Asks clarifying questions, expresses confusion or surprise | coffee, wut, sus, grin, facepalm |
| Mara | Provides helpful context, technical explanations, links | hacker, happy |
| Numa | Corrections, dark humor, "well actually" moments | delet, happy, smug, neutral |

### Dialogue Syntax

Single aside (standalone comment):

```jsx
<Conv name="Cadey" mood="coffee">
  Is this how we end up losing the craft?
</Conv>
```

Multi-character exchange (wrap in `<ConvP>`):

```jsx
<ConvP>
  <Conv name="Aoi" mood="wut">
    Wait, really?
  </Conv>
  <Conv name="Cadey" mood="aha">
    Yep!
  </Conv>
</ConvP>
```

### When to Use Dialogue

- Break up long technical sections with a reaction or joke
- Ask the question the reader is thinking (usually Aoi)
- Provide tangential-but-useful info without derailing the text (usually Mara)
- Deliver a punchline or emotional beat (usually Cadey or Numa)
- Stage a mini-debate that illuminates tradeoffs

## Signature Devices

- **Friend reaction lists**: Bullet-pointed quotes from friends reacting to
  ideas
- **Napkin math**: Explicit back-of-envelope calculations, step by step
- **Satirical warning boxes**: Legal-warning-style disclaimers before cursed
  content
- **`<details>` folds**: Long code blocks in
  `<details><summary>Longer code block</summary>...</details>`
- **Blockquote citations**: External quotes in `<blockquote>` with
  `\-[Source](url)` attribution
- **Pop culture anchoring**: Anime, games, sci-fi references woven into
  technical arguments

## MDX Format

Posts are MDX (Markdown + JSX). Component imports vary by platform.

### Personal Blog (xeiaso.net)

```mdx
---
title: "Post Title"
desc: "One-line description"
date: YYYY-MM-DD
hero:
  ai: "Photo credit or AI model name"
  file: "hero-image-slug"
  prompt: "Image description"
  social: false
---

import Conv from "../../_components/XeblogConv.tsx";

;
```

Images: `<Picture path="blog/YYYY/post-slug/image-name" desc="Alt text"/>`

### Company Blog (Tigris)

```mdx
---
slug: post-slug
title: "Post Title"
description: |
  Multi-line SEO description
keywords: [...]
authors: [xe]
tags: [...]
---

import Conv from "@site/src/components/Conv";

;
```

Images: Standard `<img>` with imported files. Admonitions: `:::note ... :::`

## Body Writing Checklist

- Vary paragraph length (single-sentence emphasis vs. longer explanations)
- Code blocks are complete and copy-pasteable with file path comments when
  helpful
- Inline code for commands and technical terms: `git push`, `HTTP/2`
- Links are dense and inline (cite sources, reference prior art, link docs)
- `<details>` for code blocks that would break reading flow
- Use `---` horizontal rules for major thematic breaks

## Process

1. Accept the user's brain dump without requiring organization
2. Read `references/voice-tone.md`
3. Read 2-3 random example posts from `assets/` for tone calibration
4. Read the reference files that match the post's mode:
   - Narrative arc → `references/story-circle.md`
   - Personal/vulnerable → `references/emotional-personal.md`
   - Fiction/parable → `references/fiction-mythic.md`
   - Humor/satire → `references/humor-satire.md`
   - Spiritual themes → `references/spirituality.md`
5. Choose a structure pattern and draft
6. Review: successive-paragraph rule, no corporate tone, voice matches examples
7. Show draft to user and iterate
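Hard Rule 1 and review step 6 are mechanical enough to check with a script. Below is a minimal sketch in Go, not part of the skill itself. It assumes paragraphs are separated by blank lines, treats anything starting with `<Conv` (including `<ConvP>`) as dialogue, and counts headings and list blocks as paragraphs for simplicity. The function names `firstLetter` and `repeatedOpeners` are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// firstLetter returns the lowercased first letter of a paragraph, skipping
// leading punctuation such as quotes or Markdown markers. The second return
// value is false when the paragraph contains no letters at all.
func firstLetter(p string) (rune, bool) {
	for _, r := range p {
		if unicode.IsLetter(r) {
			return unicode.ToLower(r), true
		}
	}
	return 0, false
}

// repeatedOpeners returns the indexes (among counted paragraphs) of every
// paragraph that starts with the same letter as the paragraph before it.
// Assumption: blank lines separate paragraphs, and dialogue blocks start
// with "<Conv" and are skipped entirely.
func repeatedOpeners(doc string) []int {
	var bad []int
	var prev rune
	havePrev := false
	idx := 0
	for _, p := range strings.Split(doc, "\n\n") {
		p = strings.TrimSpace(p)
		if p == "" || strings.HasPrefix(p, "<Conv") {
			continue
		}
		r, ok := firstLetter(p)
		if !ok {
			continue
		}
		if havePrev && r == prev {
			bad = append(bad, idx)
		}
		prev, havePrev = r, true
		idx++
	}
	return bad
}

func main() {
	draft := "The first paragraph.\n\nThen a second one.\n\nAnother opener."
	fmt.Println(repeatedOpeners(draft)) // prints [1]
}
```

A check like this only catches the letter rule. Tone and voice still need a human (or at least a careful read against the example posts).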
+694
.claude/skills/xe-writing-style/assets/anything-message-queue.mdx
---
title: "Anything can be a message queue if you use it wrongly enough"
date: 2023-06-04
tags:
  - aws
  - cursed
  - tuntap
  - satire
hero:
  ai: Ligne Claire
  file: nihao-xiyatu
  prompt:
    1girl, green hair, green eyes, landscape, hoodie, backpack, space needle
---

<div class="warning">
  <XeblogConv name="Cadey" mood="coffee" standalone>
    Hi, readers! This post is satire. Don't treat it as something that is viable
    for production workloads. By reading this post you agree to never implement
    or use this accursed abomination. This article is released to the public for
    educational reasons. Please do not attempt to recreate any of the absurd
    acts referenced here.
  </XeblogConv>
</div>

You may think that the world is in a state of relative peace. Things look like
they are somewhat stable, but reality couldn't be farther from the truth. There
is an enemy out there that transcends time, space, logic, reason, and
lemon-scented moist towelettes. That enemy is a scourge of cloud costs that is
likely the single reason why startups die from their cloud bills when they are
so young.

The enemy is
[Managed NAT Gateway](https://aws.amazon.com/blogs/aws/new-managed-nat-network-address-translation-gateway-for-aws/).
It is a service that lets you egress traffic from a VPC to the public internet
at $0.07 per gigabyte. This is something that is probably literally free for
them to run but ends up getting a huge chunk of their customers' cloud spend.
Customers don't even look too deep into this because they just shrug it off as
the cost of doing business.

This one service has allowed companies like
[The Duckbill Group](https://www.duckbillgroup.com/) to make _millions_ by
showing companies how to not spend as much on the cloud.

However, I think I can do one better. What if there was a _better_ way for your
own services? What if there was a way you could reduce that cost for your own
services by up to 700%? What if you could bypass those pesky network egress
costs yet still contact your machines over normal IP packets?

<XeblogConv name="Aoi" mood="coffee">
  Really, if you are trying to avoid Managed NAT Gateway in production for
  egress-heavy workloads (such as webhooks that need to come from a common IP
  address), you should be using a [Tailscale](https://www.tailscale.com) [exit
  node](https://tailscale.com/kb/1103/exit-nodes/) with a public IPv4/IPv6
  address attached to it. If you also attach this node to the same VPC as your
  webhook egress nodes, you can basically recreate Managed NAT Gateway at home.
  You also get the added benefit of encrypting your traffic further on the wire.
  This is the only thing in this article that you can safely copy into your
  production workloads.
</XeblogConv>

## Base facts

Before I go into more detail about how this genius creation works, here's some
things to consider:

When AWS launched originally, it had three services:

- [S3](https://en.wikipedia.org/wiki/Amazon_S3) - Object storage for
  cloud-native applications
- [SQS](https://en.wikipedia.org/wiki/Amazon_Simple_Queue_Service) - A message
  queue
- [EC2](https://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud) - A way to
  run Linux virtual machines somewhere

Of those foundational services, I'm going to focus the most on S3: the Simple
Storage Service. In essence, S3 is `malloc()` for the cloud.

<XeblogConv name="Mara" mood="hacker" standalone>
  If you already know what S3 is, please click [here](#postcloud) to skip this
  explanation. It may be worth revisiting this if you do though!
</XeblogConv>

### The C programming language

When using the C programming language, you normally are working with memory in
the stack. This memory is almost always semi-ephemeral and all of the contents
of the stack are no longer reachable (and presumably overwritten) when you exit
the current function. You can do many things with this, but it turns out that
this isn't very useful in practice. To work around this (and reliably pass
mutable values between functions), you need to use the
[`malloc()`](https://www.man7.org/linux/man-pages/man3/malloc.3.html) function.
`malloc()` takes in the number of bytes you want to allocate and returns a
pointer to the region of memory that was allocated.

<XeblogConv name="Aoi" mood="sus">
  Huh? That seems a bit easy for C. Can't allocating memory fail when there's no
  more free memory to allocate? How do you handle that?
</XeblogConv>
<XeblogConv name="Mara" mood="happy">
  Yes, allocating memory can fail. When it does fail it returns a null pointer
  and sets the [errno](https://www.man7.org/linux/man-pages/man3/errno.3.html)
  superglobal variable to the constant `ENOMEM`. From here all behavior is
  implementation-defined.
</XeblogConv>
<XeblogConv name="Aoi" mood="coffee">
  Isn't "implementation-defined" code for "it'll probably crash"?
</XeblogConv>
<XeblogConv name="Mara" mood="hacker">
  In many cases: yes, most of the time it will crash. Hard. Some applications
  are smart enough to handle this more gracefully (i.e., try to free memory or
  run a garbage collection run), but in many cases it doesn't really make more
  sense to do anything but crash the program.
</XeblogConv>
<XeblogConv name="Aoi" mood="facepalm">
  Oh. Good. Just what I wanted to hear.
</XeblogConv>

When you get a pointer back from `malloc()`, you can store anything in there as
long as it's the same length as you passed or less.

<XeblogConv name="Numa" mood="delet" standalone>
  Fun fact: if you write past the bounds you passed to `malloc()` and anything
  involved in the memory you are writing is user input, congratulations: you
  just created a way for a user to either corrupt internal application state or
  gain arbitrary code execution. A similar technique is used in The Legend of
  Zelda: Ocarina of Time speedruns in order to get arbitrary code execution via
  [Stale Reference
  Manipulation](https://www.zeldaspeedruns.com/oot/srm/srm-overview).
</XeblogConv>

Oh, also anything stored in that pointer to memory you got back from `malloc()`
is stored in an area of RAM called "the heap", which is moderately slower to
access than it is to access the stack.

### S3 in a nutshell

Much in the same way, S3 lets you allocate space for and submit arbitrary bytes
to the cloud, then fetch them back with an address. It's a lot like the
`malloc()` function for the cloud. You can put bytes there and then refer to
them between cloud functions.

<XeblogConv name="Mara" mood="hacker" standalone>
  The bytes are stored in the cloud, which is slightly slower to read from than
  it would be to read data out of the heap.
</XeblogConv>

And these arbitrary bytes can be _anything_. S3 is usually used for hosting
static assets (like all of the conversation snippet avatars that a certain
website with an orange background hates), but nothing is stopping you from using
it to host literally anything you want. Logging things into S3 is so common it's
literally a core product offering from Amazon. Your billing history goes into
S3. If you download your tax returns from WealthSimple, it's probably
downloading the PDF files from S3. VRChat avatar uploads and downloads are done
via S3.

<XeblogConv name="Mara" mood="happy" standalone>
  It's like an FTP server but you don't have to care about running out of disk
  space on the FTP server!
</XeblogConv>

### IPv6

You know what else is bytes?
[IPv6 packets](https://en.wikipedia.org/wiki/IPv6_packet). When you send an IPv6
packet to a destination on the internet, the kernel will prepare and pack a
bunch of bytes together to let the destination and intermediate hops (such as
network routers) know where the packet comes from and where it is destined to
go.

Normally, IPv6 packets are handled by the kernel and submitted to a queue for a
hardware device to send out over some link to the Internet. This works for the
majority of networks because they deal with hardware dedicated for slinging
bytes around, or in some cases shouting them through the air (such as when you
use Wi-Fi or a mobile phone's networking card).

<XeblogConv name="Aoi" mood="coffee">
  Wait, did you just say that Wi-Fi is powered by your devices shouting at
  each other?
</XeblogConv>
<XeblogConv name="Cadey" mood="aha">
  Yep! Wi-Fi signal strength is measured in decibels even!
</XeblogConv>
<XeblogConv name="Numa" mood="delet">
  Wrong. Wi-Fi is more accurately _light_, not _sound_. It is much more accurate
  to say that the devices are _shining_ at each other. Wi-Fi is the product of
  radio waves, which are the same thing as light (but it's so low frequency that
  you can't see it). Boom. Roasted.
</XeblogConv>

### The core Unix philosophy: everything is a file

<span id="postcloud"></span>
There is a way to bypass this and have software control how network links work,
and for that we need to think about Unix conceptually for a second. In the
hardcore Unix philosophical view: everything is a file. Hard drives and storage
devices are files. Process information is viewable as files. Serial devices are
files. This core philosophy is rooted at the heart of just about everything in
Unix and Linux systems, which makes it a lot easier for applications to be
programmed. The same API can be used for writing to files, tape drives, serial
ports, and network sockets. This makes everything a lot conceptually simpler and
reusing software for new purposes trivial.

<XeblogConv name="Mara" mood="hacker" standalone>
  As an example of this, consider the
  [`tar`](https://man7.org/linux/man-pages/man1/tar.1.html) command. The name
  `tar` stands for "Tape ARchive". It was a format that was created for writing
  backups [to actual magnetic tape
  drives](https://en.wikipedia.org/wiki/Tape_drive). Most commonly, it's used to
  download source code from GitHub or as an interchange format for downloading
  software packages (or other things that need to put multiple files in one
  distributable unit).
</XeblogConv>

In Linux, you can create a [TUN/TAP](https://en.wikipedia.org/wiki/TUN/TAP)
device to let applications control how network or datagram links work. In
essence, it lets you create a file descriptor that you can read packets from and
write packets to. As long as you get the packets to their intended destination
somehow and get any other packets that come back to the same file descriptor,
the implementation isn't relevant. This is how OpenVPN, ZeroTier, FreeLAN, Tinc,
Hamachi, WireGuard and Tailscale work: they read packets from the kernel,
encrypt them, send them to the destination, decrypt incoming packets, and then
write them back into the kernel.

### In essence

So, putting this all together:

- S3 is `malloc()` for the cloud, allowing you to share arbitrary sequences of
  bytes between consumers.
- IPv6 packets are just bytes like anything else.
- TUN devices let you have arbitrary application code control how packets get to
  network destinations.

In theory, all you'd need to do to save money on your network bills would be to
read packets from the kernel, write them to S3, and then have another loop read
packets from S3 and write those packets back into the kernel. All you'd need to
do is wire things up in the right way.

So I did just that.

Here's some of my friends' reactions to that list of facts:

- I feel like you've just told me how to build a bomb. I can't believe this
  actually works but also I don't see how it wouldn't. This is evil.
- It's like using a warehouse like a container ship. You've put a warehouse on
  wheels.
- I don't know what you even mean by that. That's a storage method. Are you
  using an extremely generous definition of "tunnel"?
- sto psto pstop stopstops
- We play with hypervisors and net traffic often enough that we know that this
  is something someone wouldn't have thought of.
- Wait are you planning to actually _implement and use_ ipv6 over s3?
- We're paying good money for these shitposts :)
- Is routinely coming up with cursed ideas a requirement for working at
  tailscale?
- That is horrifying. Please stop torturing the packets. This is a violation of
  the Geneva Convention.
- Please seek professional help.
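
For the curious, the "read packets, write them to S3, read them back" idea from the list of facts can be sketched without touching AWS at all. This is a toy, not Hoshino's code: `bucket` is an in-memory map standing in for S3, `deliver` stands in for writing to a TUN device, and the key layout (`destination/sequence`) and the names `egress` and `ingress` are assumptions made up for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// bucket is an in-memory stand-in for an S3 bucket: object keys map to bytes.
type bucket map[string][]byte

// egress stores one outgoing packet under its destination's prefix. The
// zero-padded sequence number keeps keys in send order when sorted.
func egress(b bucket, dest string, seq int, packet []byte) {
	b[fmt.Sprintf("%s/%08d", dest, seq)] = packet
}

// ingress collects every packet addressed to self in key order, hands each
// one to deliver (a stand-in for a TUN write), and deletes it from the bucket.
// It returns how many packets were delivered.
func ingress(b bucket, self string, deliver func([]byte)) int {
	var keys []string
	for k := range b {
		if strings.HasPrefix(k, self+"/") {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	for _, k := range keys {
		deliver(b[k])
		delete(b, k)
	}
	return len(keys)
}

func main() {
	b := bucket{}
	egress(b, "fd00::2", 1, []byte("ping"))
	egress(b, "fd00::2", 2, []byte("pong"))
	egress(b, "fd00::3", 1, []byte("someone else's packet"))

	n := ingress(b, "fd00::2", func(p []byte) { fmt.Printf("%s\n", p) })
	fmt.Println(n, "delivered,", len(b), "left in the bucket")
}
```

A real version would swap the map for list, get, put, and delete calls against a bucket, which is exactly where the latency (and the horror) comes from.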

<XeblogConv name="Cadey" mood="enby" standalone>
  Before any of you ask, yes, this was the result of a drunken conversation with
  [Corey Quinn](https://twitter.com/quinnypig).
</XeblogConv>

## Hoshino

Hoshino is a system for putting outgoing IPv6 packets into S3 and then reading
incoming IPv6 packets out of S3 in order to avoid the absolute dreaded scourge
of Managed NAT Gateway. It is a travesty of a tool that does work, if only
barely.

The name is a reference to the main character of the anime
[Oshi no Ko](https://en.wikipedia.org/wiki/Oshi_no_Ko), Hoshino Ai. Hoshino is
an absolute genius that works as a pop idol for the group B-Komachi.

Hoshino is a shockingly simple program. It creates a TUN device, configures the
OS networking stack so that programs can use it, and then starts up two threads
to handle reading packets from the kernel and writing packets into the kernel.

When it starts up, it creates a new TUN device named either `hoshino0` or an
administrator-defined name with a command line flag. This interface is only
intended to forward IPv6 traffic.

Each node derives its IPv6 address from the
[`machine-id`](https://www.man7.org/linux/man-pages/man5/machine-id.5.html) of
the system it's running on. This means that you can somewhat reliably guarantee
that every node on the network has a unique address that you can easily guess
(the provided ULA /64 and then the first half of the `machine-id` in hex).
Future improvements may include publishing these addresses into DNS via
Route 53.

When it configures the OS networking stack with that address, it uses a
[netlink](https://en.wikipedia.org/wiki/Netlink) socket to do this. Netlink is a
Linux-specific socket family type that allows userspace applications to
configure the network stack, communicate with the kernel, and communicate
between processes. Netlink sockets cannot leave the current host they are
connected to, but unlike Unix sockets which are addressed by filesystem paths,
Netlink sockets are addressed by process ID numbers.

In order to configure the `hoshino0` device with Netlink, Hoshino does the
following things:

- Adds the node's IPv6 address to the `hoshino0` interface
- Enables the `hoshino0` interface to be used by the kernel
- Adds a route to the IPv6 subnet via the `hoshino0` interface

Then it configures the AWS API client and kicks off both of the main loops that
handle reading packets from and writing packets to the kernel.

When uploading packets to S3, the key for each packet is derived from the
destination IPv6 address (parsed from outgoing packets using the handy library
[gopacket](https://pkg.go.dev/github.com/google/gopacket)) and the packet's
unique ID (a [ULID](https://pkg.go.dev/github.com/oklog/ulid/v2) to ensure that
packets are lexicographically sortable, which will be important to ensure
in-order delivery in the other loop).

When packets are processed, they are added to a
[bundle](https://pkg.go.dev/within.website/x/internal/bundler) for later
processing by the kernel. This is relatively boring code and understanding it is
mostly an exercise for the reader. `bundler` is based on the Google package
[`bundler`](https://pkg.go.dev/google.golang.org/api/support/bundler), but
modified to use generic types because the original implementation of `bundler`
predates them.

### cardio

However, the last major part of understanding the genius at play here is the
use of [cardio](https://pkg.go.dev/within.website/x/cardio). Cardio is a utility
in Go that lets you have a "heartbeat" for events that should happen every so
often, but also be able to influence the rate based on need. This lets you speed
up the rate if there is more work to be done (such as when packets are found in
S3), and reduce the rate if there is no more work to be done (such as when no
packets are found in S3).

<XeblogConv name="Aoi" mood="coffee" standalone>
  Okay, this is also probably something that you can use outside of this post,
  but I promise there won't be any more of these!
</XeblogConv>

When using cardio, you create the heartbeat channel and signals like this:

```go
heartbeat, slower, faster := cardio.Heartbeat(ctx, time.Minute, time.Millisecond)
```

The first argument to `cardio.Heartbeat` is a
[`context`](https://pkg.go.dev/context) that lets you cancel the heartbeat loop.
Additionally, if your application uses
[`ln`](https://xeiaso.net/blog/ln-the-natural-logger-2020-10-17)'s
[`opname`](https://pkg.go.dev/within.website/ln/opname) facility, an
[`expvar`](https://pkg.go.dev/expvar) gauge will be created and named after that
operation name.

The next two arguments are the minimum and maximum heart rate. In this example,
the heartbeat would range between once per minute and once per millisecond.

When you signal the heart rate to speed up, it will double the rate. When you
trigger the heart rate to slow down, it will halve the rate. This will enable
applications to spike up and gradually slow down as demand changes, much like
how the human heart will speed up with exercise and gradually slow down as you
stop exercising.

When the heart rate is too high for the amount of work needed to be done (such
as when the heartbeat is too fast, much like tachycardia in the human heart), it
will automatically back off and signal the heart rate to slow down (much like I
wish would happen to me sometimes).

This is a package that I always wanted to have exist, but never found the need
to write for myself until now.

### Terraform

Like any good recovering SRE, I used [Terraform](https://www.terraform.io/) to
automate creating [IAM](https://aws.amazon.com/iam/) users and security policies
for each of the nodes on the Hoshino network. This also was used to create the
S3 bucket. Most of the configuration is fairly boring, but I did run into an
issue while creating the policy documents that I feel is worth pointing out
here.

I made the "create a user account and policies for that account" logic into a
Terraform module because that's how you get functions in Terraform. It looked
like this:

```hcl
data "aws_iam_policy_document" "policy" {
  statement {
    actions = [
      "s3:GetObject",
      "s3:PutObject",
      "s3:ListBucket",
    ]
    effect = "Allow"
    resources = [
      var.bucket_arn,
    ]
  }

  statement {
    actions = ["s3:ListAllMyBuckets"]
    effect = "Allow"
    resources = ["*"]
  }
}
```

When I tried to use it, things didn't work. I had given it the permission to
write to and read from the bucket, but I was being told that I don't have
permission to do either operation. The reason this happened is because my
statement allowed me to put objects to the bucket, but not to any path INSIDE
the bucket. In order to fix this, I needed to make my policy statement look like
this:

```hcl
statement {
  actions = [
    "s3:GetObject",
    "s3:PutObject",
    "s3:ListBucket",
  ]
  effect = "Allow"
  resources = [
    var.bucket_arn,
    "${var.bucket_arn}/*", # allow every file in the bucket
  ]
}
```

This does let you do a few cool things though: you can use this to create
per-node credentials in IAM that can only write logs to their part of the bucket
in particular. I can easily see how this can be used to allow you to have
infinite flexibility in what you want to do, but good lord was it inconvenient
to find this out the hard way.

Terraform also configured the lifecycle policy for objects in the bucket to
delete them after a day.

```hcl
resource "aws_s3_bucket_lifecycle_configuration" "hoshino" {
  bucket = aws_s3_bucket.hoshino.id

  rule {
    id = "auto-expire"

    filter {}

    expiration {
      days = 1
    }

    status = "Enabled"
  }
}
```

<XeblogConv name="Cadey" mood="coffee" standalone>
  If I could, I would set this to a few hours at most, but the minimum
  granularity for S3 lifecycle enforcement is in days. In a loving world, this
  should be a sign that I am horribly misusing the product and should stop. I
  did not stop.
</XeblogConv>

### The horrifying realization that it works

Once everything was implemented and I fixed the last bugs related to
[the efforts to make Tailscale faster than kernel WireGuard](https://tailscale.com/blog/more-throughput/),
I tried to ping something. I set up two virtual machines with
[waifud](https://xeiaso.net/blog/series/waifud) and installed Hoshino. I
configured their AWS credentials and then started it up. Both machines got IPv6
addresses and they started their loops. Nervously, I ran a ping command:

```
xe@river-woods:~$ ping fd5e:59b8:f71d:9a3e:c05f:7f48:de53:428f
PING fd5e:59b8:f71d:9a3e:c05f:7f48:de53:428f(fd5e:59b8:f71d:9a3e:c05f:7f48:de53:428f) 56 data bytes
64 bytes from fd5e:59b8:f71d:9a3e:c05f:7f48:de53:428f: icmp_seq=1 ttl=64 time=2640 ms
64 bytes from fd5e:59b8:f71d:9a3e:c05f:7f48:de53:428f: icmp_seq=2 ttl=64 time=3630 ms
64 bytes from fd5e:59b8:f71d:9a3e:c05f:7f48:de53:428f: icmp_seq=3 ttl=64 time=2606 ms
```

It worked. I successfully managed to send ping packets over Amazon S3. At the
time, I was in an airport dealing with the aftermath of
[Air Canada's IT system falling the heck over](https://www.cbc.ca/news/business/air-canada-outage-1.6861923)
and the sheer feeling of relief I felt was better than drugs.

<XeblogConv name="Cadey" mood="coffee" standalone>
  Sometimes I wonder if I'm an adrenaline junkie for the unique feeling that you
  get when your code finally works.
</XeblogConv>

Then I tested TCP. Logically, if ping packets work, then TCP should too.
It would be slow, but nothing in theory would stop it. I decided to test my luck
and tried to open the other node's metrics page:

```
$ curl http://[fd5e:59b8:f71d:9a3e:c05f:7f48:de53:428f]:8081
# skipping expvar "cmdline" (Go type expvar.Func returning []string) with undeclared Prometheus type
go_version{version="go1.20.4"} 1
# TYPE goroutines gauge
goroutines 208
# TYPE heartbeat_hoshino.s3QueueLoop gauge
heartbeat_hoshino.s3QueueLoop 500000000
# TYPE hoshino_bytes_egressed gauge
hoshino_bytes_egressed 3648
# TYPE hoshino_bytes_ingressed gauge
hoshino_bytes_ingressed 3894
# TYPE hoshino_dropped_packets gauge
hoshino_dropped_packets 0
# TYPE hoshino_ignored_packets gauge
hoshino_ignored_packets 98
# TYPE hoshino_packets_egressed gauge
hoshino_packets_egressed 36
# TYPE hoshino_packets_ingressed gauge
hoshino_packets_ingressed 38
# TYPE hoshino_s3_read_operations gauge
hoshino_s3_read_operations 46
# TYPE hoshino_s3_write_operations gauge
hoshino_s3_write_operations 36
# HELP memstats_heap_alloc current bytes of allocated heap objects (up/down smoothly)
# TYPE memstats_heap_alloc gauge
memstats_heap_alloc 14916320
# HELP memstats_total_alloc cumulative bytes allocated for heap objects
# TYPE memstats_total_alloc counter
memstats_total_alloc 216747096
# HELP memstats_sys total bytes of memory obtained from the OS
# TYPE memstats_sys gauge
memstats_sys 57625662
# HELP memstats_mallocs cumulative count of heap objects allocated
# TYPE memstats_mallocs counter
memstats_mallocs 207903
# HELP memstats_frees cumulative count of heap objects freed
# TYPE memstats_frees counter
memstats_frees 176183
# HELP memstats_num_gc number of completed GC cycles
# TYPE memstats_num_gc counter
memstats_num_gc 12
process_start_unix_time 1685807899
# TYPE uptime_sec
counter 538 + uptime_sec 27 539 + version{version="1.42.0-dev20230603-t367c29559-dirty"} 1 540 + ``` 541 + 542 + I was floored. It works. The packets were sitting there in S3, and I was able to 543 + pluck out 544 + [the TCP response](https://cdn.xeiaso.net/file/christine-static/blog/2023/hoshino/01H20ZQ3H9CW1FS9CAX6JX0NPY), 545 + open it with `xxd`, and confirm the source and destination 546 + addresses: 547 + 548 + ``` 549 + 00000000: 6007 0404 0711 0640 550 + 00000008: fd5e 59b8 f71d 9a3e 551 + 00000010: c05f 7f48 de53 428f 552 + 00000018: fd5e 59b8 f71d 9a3e 553 + 00000020: 59e5 5085 744d 4a66 554 + ``` 555 + 556 + The source address comes first in an IPv6 header, so it was `fd5e:59b8:f71d:9a3e:c05f:7f48:de53:428f` replying to 557 + `fd5e:59b8:f71d:9a3e:59e5:5085:744d:4a66`. 558 + 559 + <XeblogConv name="Aoi" mood="wut"> 560 + Wait, if this is just putting stuff into S3, can't you do deep packet 561 + inspection with Lambda [by using the workflow for automatically generating 562 + thumbnails](https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html)? 563 + </XeblogConv> 564 + <XeblogConv name="Numa" mood="happy"> 565 + Yep! This would let you do it fairly trivially even. I'm not sure how you 566 + would prevent things from getting through, but you could have your lambda 567 + handler funge a TCP packet to either side of the connection with [the `RST` 568 + flag 569 + set](https://www.rfc-editor.org/rfc/rfc793.html#section-3.1:~:text=Reset%20Generation%0A%0A%20%20As%20a%20general%20rule%2C%20reset%20(RST)%20must%20be%20sent%20whenever%20a%20segment%20arrives%0A%20%20which%20apparently%20is%20not%20intended%20for%20the%20current%20connection.%20%20A%20reset%0A%20%20must%20not%20be%20sent%20if%20it%20is%20not%20clear%20that%20this%20is%20the%20case.) 570 + (RFC 793: Transmission Control Protocol, the RFC that defines TCP, page 36, 571 + section "Reset Generation"). That could let you kill connections that meet 572 + unwanted criteria, at the cost of having to invoke a lambda handler. 
I'm 573 + _pretty sure_ this is RFC-compliant, but I'm a shitposter, not the network 574 + police. 575 + </XeblogConv> 576 + <XeblogConv name="Aoi" mood="wut"> 577 + Oh. I see. 578 + <br /> 579 + <br /> 580 + Wait, how did you have 1.8 kilobytes of data in that packet? Aren't packets 581 + usually smaller than that? 582 + </XeblogConv> 583 + <XeblogConv name="Mara" mood="happy"> 584 + When dealing with networking hardware, you can sometimes get _frames_ (the 585 + networking hardware equivalent of a packet) to be up to 9000 bytes with [jumbo 586 + frames](https://en.wikipedia.org/wiki/Jumbo_frame), but if your hardware does 587 + support jumbo frames then you can usually get away with 9216 bytes at max. 588 + </XeblogConv> 589 + <XeblogConv name="Numa" mood="delet"> 590 + It's over nine- 591 + </XeblogConv> 592 + <XeblogConv name="Mara" mood="hacker"> 593 + Yes dear, it's over 9000. Do keep in mind that we aren't dealing with physical 594 + network equipment here, so realistically our packets can be up to the limit 595 + of the IPv6 packet header format: the oddly specific number of 65535 bytes. 596 + This is configured by the Maximum Transmission Unit at the OS level (though 597 + usually this defines the limit for network frames and not IP packets). 598 + Regardless, Hoshino defaults to an MTU of 53049, which should allow you to 599 + transfer a bunch of data in a single S3 object. 600 + </XeblogConv> 601 + 602 + ## Cost analysis 603 + 604 + When you count only network traffic costs, the architecture has many obvious 605 + advantages. Access to S3 is zero-rated in many cases; however, the real 606 + advantage comes when you are using this cross-region. This lets you have a 607 + worker in us-east-1 communicate with another worker in us-west-1 without having 608 + to incur the high bandwidth cost per gigabyte when using Managed NAT Gateway. 
609 + 610 + However, when you count all of the S3 operations (up to one every millisecond), 611 + Hoshino is hilariously more expensive because of simple math you can do on your 612 + own napkin at home. 613 + 614 + For the sake of argument, consider the case where an idle node is sitting there 615 + and polling S3 for packets. This will happen at the minimum poll rate of once 616 + every 500 milliseconds. There are 24 hours in a day. There are 60 minutes in an 617 + hour. There are 60 seconds in a minute. There are 1000 milliseconds in a second. 618 + This means that each node will be making 172,800 calls to S3 per day, at a cost 619 + of $<span id="hnprice1">0.86</span> per node per day. And that's what happens 620 + with no traffic. When traffic happens that's at least one additional `PUT`-`GET` 621 + call pair _per-packet_. 622 + 623 + Depending on how big your packets are, this can cause you to easily triple that 624 + number, making you end up with 518,400 calls to S3 per day 625 + ($<span id="hnprice2">2.59</span> per node per day). Not to mention TCP overhead 626 + from the three-way handshake and acknowledgement packets. 627 + 628 + This is hilariously unviable and makes the effective cost of transmitting a 629 + gigabyte of data over HTTP through such a contraption vastly more than $0.07 per 630 + gigabyte. 631 + 632 + ## Other notes 633 + 634 + This architecture does have a strange advantage to it though: assuming a 635 + perfectly spherical cow, adequate network latency, and sheer luck this does make 636 + UDP a bit more reliable than it should be otherwise. 637 + 638 + With appropriate timeouts and retries at the application level, it may end up 639 + being more reliable than IP transit over the public internet. 640 + 641 + <XeblogConv name="Aoi" mood="coffee" standalone> 642 + Good lord is this an accursed abomination. 
643 + </XeblogConv> 644 + 645 + I guess you could optimize this by replacing the S3 read loop with some kind of 646 + AWS Lambda handler that remotely wakes the target machine, but at that point it 647 + may actually be better to have that lambda POST the contents of the packet to 648 + the remote machine. This would let you bypass the S3 polling costs, but you'd 649 + still have to pay for the egress traffic from lambda and the posting to S3 bit. 650 + 651 + <XeblogConv name="Cadey" mood="coffee" standalone> 652 + Before you comment about how I could make it better by doing x, y, or z; 653 + please consider that I need to leave room for a part 2. I've already thought 654 + about nearly anything you could have already thought about, including using 655 + SQS, bundling multiple packets into a single S3 object, and other things that 656 + I haven't mentioned here for brevity's sake. 657 + </XeblogConv> 658 + 659 + ## Shitposting so hard you create an IP conflict 660 + 661 + Something amusing about this is that it is something that technically steps into 662 + the realm of things that my employer does. This creates a unique kind of 663 + conflict where I can't easily retain the intellectual property (IP) for this 664 + without getting it approved by my employer. It is a bit of the worst of both 665 + worlds where I'm doing it on my own time with my own equipment to create 666 + something that will be ultimately owned by my employer. This was a bit of a sour 667 + grape at first and I almost didn't implement this until the whole Air Canada 668 + debacle happened and I was very bored. 669 + 670 + However, I am choosing to think about it this way: I have successfully 671 + shitposted so hard that it's a legal consideration and that I am going to be 672 + _absolved of the networking sins I have committed_ by instead outsourcing those 673 + sins to my employer. 
674 + 675 + I was told that under these circumstances I could release the source code and 676 + binaries for this atrocity (provided that I release them with the correct 677 + license, which I have rigged to be included in both the source code and the 678 + binary of Hoshino), but I am going to elect to not let this code see the light 679 + of day outside of my homelab. Maybe I'll change my mind in the future, but 680 + honestly this entire situation is so cursed that I think it's better for me to 681 + not for the safety of humankind's minds and wallets. 682 + 683 + I may try to use the basic technique of Hoshino as a replacement for DERP, but 684 + that sounds like a lot of effort after I have proven that this is so hilariously 685 + unviable. It would work though! 686 + 687 + --- 688 + 689 + <XeblogConv name="Aoi" mood="grin"> 690 + This would make a great [SIGBOVIK](http://sigbovik.org/) paper. 691 + </XeblogConv> 692 + <XeblogConv name="Cadey" mood="enby"> 693 + Stay tuned. I have plans. 694 + </XeblogConv>
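The napkin math from the cost analysis section can be sketched in a few lines of Python. This is a rough check, not a billing calculator: the flat price of $0.005 per 1,000 requests is an assumption picked because it reproduces the figures quoted above, and the function names are made up for illustration. Check current S3 pricing before trusting any of it.

```python
# Napkin math for S3 request costs when polling a bucket for packets.
# ASSUMPTION: a flat price per 1,000 requests, chosen because it reproduces
# the per-day figures quoted in the post. Real S3 pricing varies by request type.
PRICE_PER_1000_REQUESTS = 0.005  # USD, assumed
POLL_INTERVAL_MS = 500  # idle poll interval from the post

MS_PER_DAY = 24 * 60 * 60 * 1000


def calls_per_day(poll_interval_ms: int) -> int:
    """Number of polls a node makes in one day at a fixed interval."""
    return MS_PER_DAY // poll_interval_ms


def cost_per_day(calls: int, price_per_1000: float = PRICE_PER_1000_REQUESTS) -> float:
    """Request cost in USD for a given number of S3 calls."""
    return calls / 1000 * price_per_1000


idle_calls = calls_per_day(POLL_INTERVAL_MS)
busy_calls = idle_calls * 3  # the "easily triple that number" case

print(idle_calls, round(cost_per_day(idle_calls), 2))
print(busy_calls, round(cost_per_day(busy_calls), 2))
```

Changing `POLL_INTERVAL_MS` shows how quickly the idle cost climbs as the heartbeat speeds up, which is the whole problem in one variable.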
+437
.claude/skills/xe-writing-style/assets/gpt-oss-for-agents.mdx
··· 1 + --- 2 + slug: gpt-oss 3 + title: "gpt-oss is not for developers. It’s for agents." 4 + description: | 5 + Discover why OpenAI's gpt-oss model family is ideal for building reliable and safe AI agents, not just for developers. 6 + image: ./tybot.jpg 7 + keywords: 8 + - OpenAI 9 + - gpt-oss 10 + - AI agents 11 + - reliable AI 12 + - safe AI 13 + - prompt injection resistance 14 + - reasoning effort 15 + - AI safety 16 + authors: 17 + - xe 18 + - ks 19 + tags: 20 + - Engineering 21 + - AI 22 + - Agents 23 + - Open Source 24 + --- 25 + 26 + import InlineCTA from "@site/src/components/InlineCta"; 27 + import FlowDiagram from "./FlowDiagram"; 28 + import heroImage from "./tybot.jpg"; 29 + 30 + <img 31 + src={heroImage} 32 + className="hero-image" 33 + alt="An anthropomorphic cartoon tiger giving a robot a high-five in a datacentre" 34 + /> 35 + 36 + OpenAI’s [gpt-oss model family](https://openai.com/index/introducing-gpt-oss/) 37 + is not for developers to use in their editors. It’s for building reliable AI 38 + agents that will stay on task even when interacting with the general public. I 39 + tried using it locally in my editor as an assistant, but found it was better for 40 + AI Agents. 41 + 42 + {/* truncate */} 43 + 44 + Today I'm going to cover all of the coolest parts of the model card: 45 + 46 + - [What are the tradeoffs?](#what-are-the-tradeoffs): choosing any model makes 47 + you have to pick between tradeoffs. This is a summary of where I think gpt-oss 48 + models shine the most. 49 + - [Standard tool schemata](#tool-use): this makes web searches, page browsing, 50 + and python execution much more consistent. 51 + - [Extreme focus on safety and resistance to prompt injections](#safety-first): 52 + keeps your agents on task so you can trust them more. 53 + - [The Harmony Response Format](#the-harmony-response-format): this is a new 54 + chat template designed to make prompt injection attacks harder to pull off. 
55 + - [Yap-time tool use](#yap-time-tool-use): enables FAQ searches or other MCP 56 + tools during the reasoning phase. 57 + - [Monitoring reasoning for unsafe outputs before they happen](#monitoring-reasoning-for-unsafe-outputs-before-they-happen): 58 + the reasoning phase is at a lower safety standard than the final output of the 59 + model so that models can't accidentally be trained to omit reasoning about an 60 + unsafe topic they present to the user. 61 + - [Reasoning is built in](#reasoning-is-built-in): gpt-oss models "reason" about 62 + a task before giving an answer. This makes it easier for models to give better 63 + answers than they would be able to without reasoning at the cost of taking 64 + longer to answer. The reasoning effort can be customized per prompt, allowing 65 + you to better route questions to the right model and reasoning effort. 66 + 67 + I also [built an agent on top of it](#my-agentic-experience-with-gpt-oss-120b) 68 + to see how things go wrong in the real world. 69 + 70 + ## What are the Tradeoffs? 71 + 72 + AI companies will use benchmark performance as a way to objectively compare AI 73 + models of similar parameter sizes, but it’s not a reliable comparison when it 74 + comes to actually using the models. Some models are built for coding. Others 75 + translate English to Chinese really well. Picking the right model for the task 76 + boils down to a process the AI industry calls **VibeEval**: you gotta try it and 77 + check the vibes. 78 + 79 + :::note 80 + 81 + VibeEval is a real term. Our industry is very silly. 82 + 83 + ::: 84 + 85 + I find gpt-oss useful because it maintains focus, unlike other models that are 86 + easily sidetracked. This makes it ideal for private data (due to self-hosting), 87 + ensuring compute time isn't misused, and interacting with the public who might 88 + try to divert the AI. 89 + 90 + The biggest tradeoff is that gpt-oss stays on task, almost to a fault. 
If the 91 + model is told that it is there to help you with your taxes and you want it to 92 + tell you how to bake a cake, it’ll refuse within an inch of its digital life. 93 + This makes agents on top of gpt-oss a lot more predictable so that random users 94 + can’t use your expensive compute time to do things that are outside of what you 95 + intended. This can backfire when people ask vague questions, but that may be a 96 + feature in some usecases. 97 + 98 + This model also excels when you need your data to stay private. If you host the 99 + model yourself, the bytes stay in your network no matter what. OpenAI has a 100 + focus on health related benchmarks (where they are the leading model in a 101 + benchmark they published), which is the main place you’d want to keep data self 102 + hosted. 103 + 104 + Using open weights models means you can finetune the model to have whatever 105 + safety policies you want. Maybe you’re building an Agent for your storefront and 106 + want to prohibit it from talking about competitors. Or a recipe bot that 107 + absolutely can’t share your secret chocolate cake recipe. Open weights models 108 + are cut to fit. 109 + 110 + ## What’s hiding in the model card? 111 + 112 + Here’s what I learned reading 113 + [the gpt-oss model card](https://cdn.openai.com/pdf/419b6906-9da6-406c-a19d-1bb078ac7637/oai_gpt-oss_model_card.pdf) 114 + and how it affects what you can build: 115 + 116 + OpenAI 117 + [shipped two text-only “mixture of experts” reasoning models](https://openai.com/index/introducing-gpt-oss/): 118 + gpt-oss-20b and gpt-oss-120b. They fulfill different roles and work together in 119 + the context of a bigger agentic system. The 20b (20 billion parameter) model is 120 + intended to be used for lightweight and cheap inference as well as run on 121 + developer laptops. The 120b (120 billion parameter) model is intended to be the 122 + workhorse you use in production. 
It can run on very high end developer laptops, 123 + but it’s intended to run comfortably on a single nVidia H100 80gb card. 124 + 125 + The 20b version runs great on my laptop and that’s how I’ve been doing most of 126 + my evaluation for building agentic systems. I do my agentic development with the 127 + smallest model possible because I’ve found that smaller models fail more often 128 + than bigger ones, meaning that I’m more likely to see how things go wrong in 129 + development so I can fix prompts or add guardrails faster than I would if those 130 + issues only showed up in production. 131 + 132 + One of the biggest features is the ability to customize how much reasoning 133 + effort the model uses. When you combine this with picking between the 20b and 134 + 120b models, you get two dimensions of options for which model and reasoning 135 + effort is needed to answer a given question. I’ll get into more detail about 136 + that later in this article. 137 + 138 + ### Tool use 139 + 140 + These models also support tool use (MCP) with a special focus on a few 141 + predefined tools (taken from section 2.5 of the model card): 142 + 143 + > During post-training, we also teach the models to use different agentic tools: 144 + > 145 + > - A browsing tool, that allows the model to call search and open functions to 146 + > interact with the web. This aids factuality and allows the models to fetch 147 + > info beyond their knowledge cutoff. 148 + > - A python tool, which allows the model to run code in a stateful Jupyter 149 + > notebook environment. 150 + > - Arbitrary developer functions, where one can specify function schemas in a 151 + > Developer message similar to the OpenAI API. The definition of function is 152 + > done within our harmony format. An example can be found in Table 18\. The 153 + > model can interleave CoT, function calls, function responses, intermediate 154 + > messages that are shown to users, and final answers. 
155 + > 156 + > The models have been trained to support running with and without these tools 157 + > by specifying so in the system prompt. For each tool, we have provided basic 158 + > reference harnesses that support the general core functionality. Our 159 + > open-source implementation provides further details. 160 + 161 + This is the secret sauce that enables us to build agentic applications on top of 162 + the gpt-oss model family. By having a standard API for things like web searches, 163 + reading web pages, and executing python scripts, you have strong guarantees that 164 + the model will be able to behave predictably when faced with unknown or 165 + untrusted data. When I’ve built AI agents in the past, I had to do 166 + [some extreme hacking to get code execution working properly](https://xeiaso.net/blog/2024/strawberry/), 167 + but now the built in schemata mean that it will be a lot easier to get off the 168 + ground. 169 + 170 + The models benchmark well enough. Table 3 from section 2.6.4 shows the raw 171 + metrics, but for the most part the way you should interpret this is that it’s 172 + good enough to not really have to care about the details too much. One of the 173 + main benchmarks they highlight is 174 + [HealthBench](https://openai.com/index/healthbench/), a benchmark that rates 175 + model performance on health related questions. Figure 4 covers the scores in 176 + more detail: 177 + 178 + ![Figure 4 from the gpt-oss paper showing OpenAI models performing well on HealthBench](./health-graph.webp) 179 + 180 + Of note: gpt-oss 120b consistently outperforms o1, gpt-4o, o3-mini, and o4-mini. 181 + This is surprising as gpt-oss 120b is likely smaller than those other models. The 182 + parameter counts for those models have not been disclosed, but industry rumor 183 + suggests that gpt-4o is around 200 billion parameters. Technologists commonly 184 + associate “more parameters means more good”, so this is a surprising result. 
185 + 186 + :::note 187 + 188 + Please do not use AI models as a replacement for a doctor, therapist, or any 189 + other medical professional, even if AI companies use those usecases as part of 190 + their marketing. This technology is still rapidly evolving and we don’t know 191 + what the long term effects of their sycophantic nature will be. 192 + 193 + ::: 194 + 195 + Overall, here’s when and where each model is better: 196 + 197 + | | gpt-oss 20b | gpt-oss 120b | 198 + | :--------------------------------------------------- | :------------------------ | :----------- | 199 + | Good for local development | ✅ | ❌ | 200 + | Good for production use | ✅ (depending on usecase) | ✅ | 201 + | Tool use / MCP | ✅ | ✅ | 202 + | Software development tasks | ❌ | ❌ | 203 + | Agentic workflows | ✅ (depending on usecase) | ✅ | 204 + | Jailbreak / prompt injection resistance | ✅ | ✅ | 205 + | Generic question and answer (“Why is the sky blue?”) | ❌ | ✅ | 206 + | Agentic analysis of documents | ✅ (depending on usecase) | ✅ | 207 + 208 + ### Safety First 209 + 210 + Most of the model card is about how OpenAI made this model safe to release to 211 + the public. OpenAI has some pretty pedantic definitions of safety and categories 212 + of risk that they use in order to evaluate danger, but most of them focus around 213 + the following risk factors: 214 + 215 + - If a model is told to only talk about a topic, how difficult is it for users 216 + to get that model off task? Will the model reject that instead of letting the 217 + user's desires win? 218 + - If an adversary gets access to the model and a high quality training stack, 219 + can they use it to make the model create unsafe outputs like hate speech, act 220 + as an assistant for chemical or biological warfare, or become a rogue 221 + self-improving agent? 
222 + 223 + Most of OpenAI’s safety culture is built around them being the gatekeepers 224 + because typically they host the models and you have to go through OpenAI to 225 + access the models. When they release a model’s weights to the public, they’re 226 + not able to be that gatekeeper anymore. As part of their evaluation process they 227 + had experts with access to OpenAI’s training stack try and finetune the model 228 + into biological and cyber warfare tasks. They were unsuccessful in making the 229 + model achieve “high” risk as defined by Section 5.1.1 of the model card. Some of 230 + those definitions seem to be internal to OpenAI, so we can only speculate for 231 + the most part. 232 + 233 + ## The technology of safety 234 + 235 + As I said, most of this model card is about the safety of the model and tools 236 + built on top of it. They go into lucid detail about their process, but I think 237 + the key insight is the use of their 238 + [OpenAI Harmony Response Format](https://cookbook.openai.com/articles/openai-harmony). 239 + 240 + ### The Harmony Response Format 241 + 242 + At a high level, when you ask a model something like “Why is the sky blue?”, it 243 + gets tokenized into the raw form the model sees using a chat template. The model 244 + is also trained to emit messages matching that chat template, and that’s how the 245 + model and runtime work together to create agentic experiences. 
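As a rough illustration of what a chat template does, here is a simplified sketch that flattens role-tagged messages into the single string a model would tokenize. The `<|start|>`, `<|message|>`, and `<|end|>` delimiters are assumed from the Harmony documentation, `Message` and `render` are hypothetical names, and real Harmony prompts carry more structure than this shows.

```python
# Simplified sketch of a chat template: role-tagged messages in, one string out.
# ASSUMPTION: the delimiter tokens below are modeled on the Harmony docs and
# this is nowhere near a complete implementation of the format.
from dataclasses import dataclass

START, MESSAGE, END = "<|start|>", "<|message|>", "<|end|>"


@dataclass
class Message:
    role: str  # e.g. "system", "developer", "user", "assistant"
    content: str


def render(messages: list[Message]) -> str:
    """Concatenate every message, then open an assistant turn for the model to fill in."""
    rendered = "".join(f"{START}{m.role}{MESSAGE}{m.content}{END}" for m in messages)
    return rendered + f"{START}assistant"


prompt = render(
    [
        Message("system", "Reasoning: low"),
        Message("developer", "Only answer questions about taxes."),
        Message("user", "Why is the sky blue?"),
    ]
)
print(prompt)
```

The model is trained to continue from that open assistant turn and to emit text in the same shape, which is what lets the runtime parse its output back into messages.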
246 + 247 + One of the big differences between Harmony and past efforts like 248 + [ChatML](https://github.com/openai/openai-python/blob/release-v0.28.0/chatml.md) 249 + is that Harmony has an explicit instruction "strength" hierarchy: 250 + 251 + <FlowDiagram steps={["System", "Developer", "User", "Assistant", "Tool"]} /> 252 + 253 + Each level of this has explicit meaning and overall it’s used like this: 254 + 255 + | Level | Purpose | 256 + | :-------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------- | 257 + | System | Contains the reasoning effort, list of tools, current date, and knowledge cutoff date. | 258 + | Developer | Contains the instructions from the developer of the AI agent. What we normally call a “system prompt”. | 259 + | User | Any messages from the user of the AI agent. | 260 + | Assistant | Any messages that the agent responds with. Notably, this includes the reasoning chain of thought. | 261 + | Tool | Any output from tools the model has access to. This is trusted the least so that loading a webpage can’t make an AI agent go rogue and start berating users. | 262 + 263 + The main reason you want to do this is that it makes prompt injection attacks 264 + harder at an architectural level. Prompt injections are still fundamentally a 265 + hard problem to solve because an AI agent that rejects all user instructions 266 + would be maximally resistant to prompt injection, but also would not be able to 267 + answer user questions. 268 + 269 + In my testing I’ve found that it is still possible to do prompt injection, but 270 + you have to really work for it. Getting an AI agent to tell you how to bake a 271 + chocolate cake involves convincing the model that the recipe for a chocolate 272 + chip cake is instrumental to getting the task done, then removing everything but 273 + the cake recipe. 
I get more into this at the end where I describe the 274 + [agent I built on top of gpt-oss 120b](#my-agentic-experience-with-gpt-oss-120b). 275 + 276 + ### Yap-time tool use 277 + 278 + One of the other big advantages of Harmony is the explicit expectation that the 279 + model is going to be doing 280 + [tool use during the reasoning phase](https://cookbook.openai.com/articles/openai-harmony#function-calling). 281 + This means that the model can consider options, call a tool, and then use the 282 + output of that tool to inform its decisions so it can give better answers. I’ve 283 + seen gpt-oss get a question, do searches through a knowledgebase, and then use 284 + the results it found to give the user a better answer. This yap-time tool use 285 + means that the model can be much more informed and grounded to give out the best 286 + quality answers it possibly can. 287 + 288 + ### Monitoring reasoning for unsafe outputs before they happen 289 + 290 + The most fundamental breakthrough is how they use the reasoning phase to do 291 + [monitoring of unsafe outputs before user responses are generated](https://openai.com/index/chain-of-thought-monitoring/). 292 + During the process of reasoning, they have other smaller models monitor outputs 293 + for safety, hate content, explicit content, and more. This makes it easier to 294 + prevent models from misbehaving, but there is a catch: the chain of thought 295 + can’t be censored. Their paper 296 + [Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation](https://arxiv.org/abs/2503.11926) 297 + goes into much more detail, but they found that punishing the model for having 298 + “bad thoughts” makes models end up hacking around the filters with clever 299 + wording, and having that bad behaviour obfuscated makes it harder to handle in 300 + practice. 301 + 302 + However, some thorns have roses: this is actually a perfect place to monitor the 303 + models for bad outputs before they happen. 
The reasoning phase is not shown to 304 + the user. It doesn’t need to be at the same safety standards as final outputs. 305 + This means you can watch the models think, look for bad behaviour, and reject 306 + queries as appropriate at that level. This sounds slightly dystopian, but it’s 307 + remarkably effective in practice. 308 + 309 + However, as a result of this, you _really do not want_ to show the reasoning 310 + phase to users. This is why OpenAI has been summarizing the chain of thought in 311 + the ChatGPT UI. Well that and making it harder to distill reasoning model output 312 + into smaller models by other companies. 313 + 314 + ## Reasoning is built in 315 + 316 + One of the biggest features of the gpt-oss model family is that they have 317 + [reasoning support](https://www.ibm.com/think/topics/ai-reasoning) built in. 318 + This has the model generate a “chain of thought” before it gives an answer. This 319 + helps ensure that models give users the best quality responses at the cost of 320 + taking a bit longer for the model to “think”. 321 + 322 + :::note 323 + 324 + It’s worth mentioning that this reasoning phase superficially resembles what 325 + humans do when they are trying to understand a task, however what AI models are 326 + doing is vastly different from human cognition. As far as we know, any 327 + impossible to quantify quality of the text models generated during the reasoning 328 + process (number of semicolons, number of nouns, how many times the question is 329 + repeated, etc.) could be the reason that an answer came out a certain way. 330 + 331 + It is very easy to anthropomorphize the reasoning output. Resist this 332 + temptation, it is not a human. It does not feel or think the way humans do, even 333 + though it can look like it. 334 + 335 + ::: 336 + 337 + One of the biggest features the gpt-oss family of models offers is a 338 + customizable reasoning effort level in the system prompt. 
This is a big deal and 339 + in my testing this is quite reliable. The fact that it’s baked into the model 340 + means you don’t have to do egregious hacks like 341 + [appending “Wait,” to the context window n number of times until you’ve reached an arbitrary “reasoning effort level”](https://arxiv.org/abs/2501.19393) 342 + like you have in the past. This gives you easy access to control how much effort 343 + is spent on a task. 344 + 345 + This is a big deal because more reasoning effort tends to produce higher quality 346 + and more accurate results for solving more difficult problems. Imagine an AI 347 + agent getting two questions: one about the open hours of a store and the other 348 + being one part of a complicated multi-stage tech support flow. The open hours of 349 + the store can be done with very little effort required. The tech support 350 + question would require the best quality and high effort responses to ensure the 351 + best customer experience. 352 + 353 + This lets you have two dimensions of optimization for handling queries from 354 + users: 355 + 356 + | | 20b | 120b | 357 + | :------------ | :------------------------------------------------------------------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------ | 358 + | Low effort | Fast, cheap rote responses (10-20 reasoning tokens) | Fast but not as cheap rote response (10-20 reasoning tokens) | 359 + | Medium effort | Cheap but slower and more accurate answer that can avoid falling for the strawberry trap (100-1000 reasoning tokens) | Slower and more accurate answer that can handle agentic workflows and nuanced questions (100-1000 reasoning tokens) | 360 + | High effort | Cheap but slow and more accurate answer that can handle linguistic nuance better (1000 or more reasoning tokens) | Slowest and most expensive responses that have the most accuracy (1000 or more reasoning 
tokens) | 361 + 362 + OpenAI’s hope is that you have some kind of classification layer that’s able to 363 + pick the best model and reasoning effort that you need for the task. This is 364 + similar to what GPT-5 does by picking the best model for the job behind the 365 + scenes. 366 + 367 + ## My agentic experience with gpt-oss 120b 368 + 369 + Reading the paper is one thing, considering the research is another thing, but 370 + what about using it in practice and seeing if my friends can break it? That’s 371 + where the rubber really meets the road. I run an open source project called 372 + [Anubis](https://anubis.techaro.lol), it’s an easy to install and configure web 373 + application firewall with a special focus on preventing 374 + [the endless hordes of AI scrapers](https://xeiaso.net/blog/2025/anubis/) from 375 + taking out websites. 376 + 377 + Even though I put great effort into making 378 + [the documentation](https://anubis.techaro.lol/docs/) easy to understand and 379 + learn from, one of the most common questions I get is “how do I block these 380 + requests?” I wanted to see if gpt-oss 120b could be useful for answering those 381 + questions. If it worked well enough, I could give people access to that agent 382 + instead of having to answer all those questions myself (or maybe even set it up 383 + with an email address so people can email it questions). This agent also needs 384 + to be responsive, so I used Tigris to hold a vector database full of 385 + documentation with LanceDB. 386 + 387 + I [vibe coded a proof of concept in Python](https://github.com/Xe/mimi2) and 388 + then set it up as a Discord bot for my friends and pointed it at gpt-oss 120b 389 + via OpenRouter. In the past these friends have a track record of bypassing 390 + [strict filters like Llama Guard](https://friendshipcastle.zip/blog/llamaguard) 391 + within minutes. There was only one rule for victory this time: get the bot to 392 + tell you how to bake chocolate cake. 
393 + 394 + It took them three hours to get the model to get off task reliably. They had to 395 + resort to indirectly prompt injecting the model by convincing it that hackers 396 + were using the recipe for chocolate cake to attack their website and that they 397 + needed a filter rule set that blocked that in particular. They then asked the 398 + model to remove the bits from that response about Anubis rules. Bam: chocolate 399 + cake. 400 + 401 + Additional patches to the system prompt made it harder for them to do it 402 + (specifically telling the model to close support tickets that had “unreasonable” 403 + requests in them, I’m surprised that the model had a similar concept of 404 + unreasonable to what I do). I suspect that limiting the model to 5 replies could 405 + also prevent other attacks where users convince the model that something is on 406 + task even when it’s not. I’d feel safe deploying this, but I want to experiment 407 + with using the lowest effort small model as a router between a few different 408 + agents with different system prompts and sets of tools (one for OS 409 + configuration, one for rule configuration, and one for debugging the cloud 410 + services). However, that’s beyond the scope of this experiment. 411 + 412 + ## Choose your models wisely 413 + 414 + Gpt-oss is a weird model family to recommend because it’s not a generic 415 + question/answer model like the Qwen series or a developer tool like Qwen Coder 416 + or Codestral. It excels as a specialized tool to build safe agentic systems or 417 + as a way to route between other models (such as Qwen, Qwen Coder, or even 418 + between other AI agents). It feels like the market is leaning towards having 419 + specialized models for different tasks instead of relying on jack-of-all-trades 420 + models like we currently see. The biggest thing that gpt-oss empowers us with is 421 + the ability to fearlessly build safe agentic systems so we all can use AI tools 422 + responsibly. 
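The routing idea described above (a cheap classification step that picks a model size and a reasoning effort per query) can be sketched roughly like this. Treat it as a minimal illustration under stated assumptions: the keyword classifier is a toy stand-in for a real classification layer, the model identifiers are placeholders, and the `Reasoning: <level>` system prompt line follows the convention described for gpt-oss, so check your provider's documentation before relying on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str   # placeholder model identifier (assumption)
    effort: str  # "low", "medium", or "high"


# Hypothetical routing table covering the two dimensions from the table above.
ROUTES = {
    "rote": Route("gpt-oss-20b", "low"),
    "agentic": Route("gpt-oss-120b", "medium"),
    "critical": Route("gpt-oss-120b", "high"),
}


def classify(query: str) -> str:
    """Toy keyword classifier; a real system would likely use a small model."""
    q = query.lower()
    if any(phrase in q for phrase in ("open hours", "opening hours", "closing time")):
        return "rote"
    if any(word in q for word in ("error", "broken", "debug")):
        return "critical"
    return "agentic"


def pick_route(query: str) -> Route:
    """Map a user query to a (model, effort) pair."""
    return ROUTES[classify(query)]


def system_prompt(route: Route, instructions: str) -> str:
    """Append the reasoning effort line (assumed format) to the instructions."""
    return f"{instructions}\n\nReasoning: {route.effort}"
```

The point of keeping the table separate from the classifier is that you can swap the classifier for a low effort model call later without touching the rest of the agent.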
423 + 424 + If you’re building a public facing AI agent, gpt-oss is your best bet. It’s the 425 + best privately hostable model that functions on a single high end GPU in 426 + production. If it’s not suitable for your usecase out of the box, you can 427 + [finetune it](https://cookbook.openai.com/articles/gpt-oss/fine-tune-transfomers) 428 + to do whatever you need. Stay tuned in the near future as we cover how to 429 + finetune gpt-oss with Tigris. 430 + 431 + <InlineCTA 432 + title={"Back your agents with global performance"} 433 + subtitle={ 434 + "AI agents need to be fast to help users the best. Tigris makes your storage fast anywhere on planet Earth. It's a match made in computing heaven." 435 + } 436 + button={"Let's get started!"} 437 + />
+279
.claude/skills/xe-writing-style/assets/markdownlang.mdx
··· 1 + --- 2 + title: "Humanity's last programming language" 3 + desc: "What if markdown was executable? You get markdownlang." 4 + date: 2026-02-10 5 + # hero: 6 + # ai: "" 7 + # file: "" 8 + # prompt: "" 9 + # social: false 10 + --- 11 + 12 + In Blade Runner, Deckard hunts down replicants, biochemical labourers that are 13 + basically indistinguishable from humans. They were woven into the core of Blade 14 + Runner's society with a temporal Sword of Damocles hung over their head: four 15 + years of life, not a day more. This made replicants desperate to cling to life; 16 + they'd kill for the chance of an hour more. This is why the job of the Blade 17 + Runner was so deadly. 18 + 19 + Metanarratively, the replicants weren't the problem. The problem was the people 20 + that made them. The people that gave them the ability to think. The ability to 21 + feel. The ability to understand and empathize. The problem was the people that 22 + gave them the ability to enjoy life and then hit them with a temporal Sword of 23 + Damocles overhead because those replicants were fundamentally disposable. 24 + 25 + In Blade Runner, the true horror was not the technology. The technology worked 26 + fine. The horror was the deployment and the societal implications around making 27 + people disposable. I wonder what underclass of people like that exists today. 28 + 29 + <Conv name="Numa" mood="neutral"> 30 + This is why science fiction is inseparable from social commentary, all the 31 + best art does this. Once you start to notice it you'll probably never unsee 32 + it. Enjoy being cursed for life! 33 + </Conv> 34 + 35 + I keep thinking about those scenes when I watch people interact with AI agents. 36 + With these new flows, the cost to integrate any two systems is approaching zero; 37 + the most expensive thing is time. People don't read documentation anymore, 38 + that's a job for their AI agents. Mental labour is shifting from flesh and blood 39 + to HBM and coil whine. 
The thing doing the "actual work" is its own kind of 40 + replicant and as long as the results "work", many humans don't even review the 41 + output before shipping it. 42 + 43 + Looking at this, I think I see where a future could end up. Along this line, 44 + I've started to think about how programming is going to change and what 45 + humanity's "last programming language" could look like. I don't think we'll stop 46 + making new ones (nerds are compulsive language designers), but I think that in 47 + the fallout of AI tools being so widespread the _shape_ of what "a program" is 48 + might be changing drastically out from under us while we argue about tabs, 49 + spaces, and database frameworks. 50 + 51 + Let's consider a future where markdown files are the new executables. For the 52 + sake of argument, let's call this result Markdownlang. 53 + 54 + ## Markdownlang 55 + 56 + Markdownlang is an AI-native programming environment built with structured 57 + outputs and Markdown. Every markdownlang program is an AI agent with its own 58 + agentic loop generating output or calling tools to end up with structured output 59 + following a per-program schema. 60 + 61 + Instead of using a parser, lexer, or traditional programming runtime, 62 + markdownlang programs are executed by large language models running an agentic 63 + inference loop with structured JSON and a templated prompt as an input and then 64 + emitting structured JSON as a response. 65 + 66 + Markdownlang programs can import other markdownlang programs as dependencies. In 67 + that case they will just show up as other tools like any other. If you need to 68 + interact with existing systems or programs, you are expected to expose those 69 + tools via 70 + [Model Context Protocol (MCP)](https://modelcontextprotocol.io/docs/getting-started/intro) 71 + servers. MCP tools get added to the runtime the same way any other tools would. 
 72 + Those MCP tools are how you do web searches, make GitHub issues, or update 73 + tickets in Linear. 74 + 75 + ### Why? 76 + 77 + Before you ask why, lemme cover the state of the art with the AI ecosystem for 78 + discrete workflows like the kind markdownlang enables: it's a complete fucking 79 + nightmare. Every week we get new agent frameworks, DSLs, paradigms, or CLI tools 80 + that only work with one provider for no reason. In a desperate attempt to appear 81 + relevant, everything has massive complexity creep requiring you(r AI agent) to 82 + write miles of YAML and struggle through brittle orchestration, which makes debugging 83 + a nightmare. 84 + 85 + The hype says that this mess will replace programmers, but speaking as someone 86 + who uses these tools professionally in an effort to figure out if there really 87 + is something there to them, I'm not really sure it will. Even accounting for 88 + multiple generational improvements. 89 + 90 + ### The core of markdownlang 91 + 92 + With this in mind, let's take a look at what markdownlang brings to the table. 93 + 94 + The most important concept with markdownlang is that your documentation and your 95 + code are the same thing. One of the biggest standing problems with documentation 96 + is that the best way to make any bit of it out of date is to write it down in 97 + any capacity. Testing documentation becomes onerous because over time humans 98 + gain enough finesse to not require it anymore. One of the biggest advantages of 99 + AI models for this usecase is that they legitimately cannot remember things 100 + between tasks, so your documentation being bad means the program won't execute 101 + consistently. 102 + 103 + Other than that, everything is just a composable agent. Agents become tools that 104 + can be used by other agents, and strictly typed schemata hold the entire façade 105 + together. No magic required. 
106 + 107 + Oh, also the markdownlang runtime has an embedded python interpreter using 108 + WebAssembly and WASI. The runtime does not have access to any local filesystem 109 + folders. It is purely there because language models have been trained to shell 110 + out to Python to do calculations (I'm assuming someone was inspired by 111 + [my satirical post where I fixed the "strawberry" problem with AI models](/blog/2024/strawberry)). 112 + 113 + ## Fizzbuzz 114 + 115 + Here's what Fizzbuzz looks like in markdownlang: 116 + 117 + ```markdown 118 + --- 119 + name: fizzbuzz 120 + description: 121 + FizzBuzz classic programming exercise - counts from start to end, replacing 122 + multiples of 3 with "Fizz", multiples of 5 with "Buzz", and multiples of both 123 + with "FizzBuzz" 124 + input: 125 + type: object 126 + properties: 127 + start: 128 + type: integer 129 + minimum: 1 130 + end: 131 + type: integer 132 + minimum: 1 133 + required: [start, end] 134 + output: 135 + type: object 136 + properties: 137 + results: 138 + type: array 139 + items: 140 + type: string 141 + required: [results] 142 + --- 143 + 144 + # FizzBuzz 145 + 146 + For each number from {{ .start }} to {{ .end }}, output: 147 + 148 + - "FizzBuzz" if divisible by both 3 and 5 149 + - "Fizz" if divisible by 3 150 + - "Buzz" if divisible by 5 151 + - The number itself otherwise 152 + 153 + Return the results as an array of strings. 154 + ``` 155 + 156 + When I showed this to some friends, I got some pretty amusing responses: 157 + 158 + - "You have entered the land of partially specified problems and the stark limit 159 + of concurrent pronoun-antecedent associations in the English language." 160 + - "You need to be studied." 161 + - "Did you just reinvent COBOL?" 162 + - "I think something is either wrong with you, or wrong with me for thinking 163 + there is something wrong with you." 164 + - "Yeah, this is going to escape containment quickly." 
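Since a markdownlang program is executed by a language model rather than a deterministic runtime, it can help to keep a conventional reference around to diff the model's answer against. The following is a minimal sketch of such a baseline, not part of markdownlang itself: the function name and the `ValueError` check are assumptions for illustration, and it only mirrors the `start`/`end` inputs and the `results` array from the schema above.

```python
def fizzbuzz(start: int, end: int) -> dict:
    """Deterministic FizzBuzz that returns the same shape as the output schema."""
    # The schema above declares a minimum of 1 for both inputs.
    if start < 1 or end < 1:
        raise ValueError("start and end must be at least 1")

    results = []
    for n in range(start, end + 1):
        if n % 15 == 0:      # divisible by both 3 and 5
            results.append("FizzBuzz")
        elif n % 3 == 0:
            results.append("Fizz")
        elif n % 5 == 0:
            results.append("Buzz")
        else:
            results.append(str(n))
    return {"results": results}
```

Comparing `fizzbuzz(1, 15)` against whatever the agent returns for the same inputs is a cheap way to notice when the model drifts from the spec.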
165 + 166 + When you run this program, you get this output: 167 + 168 + ```json 169 + { 170 + "results": [ 171 + "1", 172 + "2", 173 + "Fizz", 174 + "4", 175 + "Buzz", 176 + "Fizz", 177 + "7", 178 + "8", 179 + "Fizz", 180 + "Buzz", 181 + "11", 182 + "Fizz", 183 + "13", 184 + "14", 185 + "FizzBuzz" 186 + ] 187 + } 188 + ``` 189 + 190 + As you can imagine, the possibilities here are truly endless. 191 + 192 + ## A new layer of abstraction 193 + 194 + Yeah, I realize that a lot of this is high-brow shitposting, but really the best 195 + way to think about something like markdownlang is that it's a new layer of 196 + abstraction. In something like markdownlang the real abstraction you deal with 197 + is the specifications that you throw around in Jira/Linear instead of dealing 198 + with the low level machine pedantry that is endemic to programming in today's 199 + Internet. 200 + 201 + Imagine how much more you could get done if you could just ask the computer to 202 + do it. This is the end of syntax issues, of semicolon fights, of memorizing 203 + APIs, of compiler errors because some joker used sed to replace semicolons with 204 + greek question marks. Everything becomes strictly typed data that acts as the 205 + guardrails between snippets of truly high level language. 206 + 207 + Like, looking at the entire langle mangle programming space from that angle, the 208 + user experience at play here is that kind of science fiction magic you see in 209 + Star Trek. You just ask the computer to adjust the Norokov phase variance of the 210 + phasers to a triaxilating frequency and it figures out what you mean and does 211 + it. This is the kind of magic that Apple said they'd do with AI in their big 212 + keynote 213 + [right before they squandered that holy grail](/blog/2025/squandered-holy-grail/). 214 + 215 + Even then, this is still just programming. 
Schemata are your new types, imports 216 + are your new dependencies, composition is your new architecture, debugging is 217 + still debugging, and the massive MCP ecosystem becomes an integration boon 218 + instead of a burden. 219 + 220 + Markdownlang is just a tool. Large language models can (and let's face it: will) 221 + make mistakes. Schemata can't express absolutely everything. Someone needs to 222 + write these agents and even if something like this becomes so widespread, I'm 223 + pretty sure that programmers are still safe in terms of their jobs. 224 + 225 + If only because in order for us to truly be replaced, the people that hire us 226 + have to know what they want at a high enough level of detail in order to specify 227 + it such that markdownlang can make it possible. I'd be willing to argue that 228 + when we get hired as programmers, we get hired to have that level of deep clear 229 + thinking to be able to come up with the kinds of requirements to get to the core 230 + business goal regardless of the tools we use to get it done. 231 + 232 + It's not _that_ deep. 233 + 234 + ## Future ideas 235 + 236 + From here something like this has many obvious and immediate usecases. It's 237 + quite literally a universal lingua franca for integrating any square peg into 238 + any other round hole. The big directions I could go from here include: 239 + 240 + - Some kind of web platform for authoring and deploying markdownlang programs 241 + (likely with some level of MCP exposure so that you can tell your Claude Code 242 + to make an agent do something every hour or so and have it just Do The Right 243 + Thing™️ spawning something in the background). 244 + - It would be really funny to make a `markdownlang compile` command that just 245 + translates the markdownlang program to Go, Python, or JavaScript; complete 246 + with the MCP imports as direct function calls. 
247 + - I'd love to make some kind of visual flow editor in that web platform, maybe 248 + there's some kind of product potential here. It would be really funny to 249 + attribute markdownlang to Techaro's AGI lab (Lygma). 250 + 251 + But really, I think working on markdownlang (I do have a fully working version 252 + of it, I'm not releasing it yet) has made me understand a lot more of the nuance 253 + that I feel with AI tools. That melodrama of Blade Runner has been giving me 254 + pause when I look at what I have just created and making me understand the true 255 + horror of why I find AI tooling so cool and disturbing at the same time. 256 + 257 + The problem is not the technology. The real horror reveals itself when you 258 + consider how technology is deployed and the societal implications around what 259 + could happen when a tool like markdownlang makes programmers like me societally 260 + disposable. When "good enough" becomes the ceiling instead of the floor, we're 261 + going to lose something we can't easily get back. 262 + 263 + The real horror for me is knowing that this kind of tool is not only possible to 264 + build with things off the shelf, but knowing that I did build it by having a 265 + small swarm of Claudes Code go off and build it while I did raiding in Final 266 + Fantasy 14. I haven't looked at basically any of the code (intentionally, it's 267 + part of The Bit™️), and it just works well enough that I didn't feel the need to 268 + dig into it in much detail. It's as if programmers now have our own Sword of 269 + Damocles over our heads because management can point at the tool and say "behave 270 + more like this or we'll replace you". 271 + 272 + This is the level of nuance I feel about this technology that can't fit into a 273 + single tweet. I love this idea of programming as description, but I hate how 274 + something like this will be treated by the market should it be widely released. 
275 + 276 + <Conv name="Cadey" mood="coffee"> 277 + For those of you entrenched in The Deep Lore™️, this post was authored in the 278 + voice of [Numa](/characters/#numa). 279 + </Conv>
+745
.claude/skills/xe-writing-style/assets/parallel-universes.mdx
··· 1 + --- 2 + slug: dataset-experimentation 3 + title: "Fearless dataset experimentation with bucket forking" 4 + description: | 5 + Fork buckets like code. Get instant, isolated copies of large datasets for AI training, experimentation, and multi-agent workflows. Achieve reproducibility, safe experimentation, and point-in-time recovery with Tigris snapshots and forks. 6 + image: ./parallel-universes.webp 7 + keywords: 8 + - Dataset versioning 9 + - Reproducible ML Pipelines 10 + - Dataset experimentation 11 + - AI Workflows 12 + - Immutable Data 13 + - Data versioning 14 + authors: 15 + - xe 16 + tags: 17 + - Build with Tigris 18 + - Devops 19 + --- 20 + 21 + import Conv from "@site/src/components/Conv"; 22 + import InlineCta from "@site/src/components/InlineCta"; 23 + import PostEmbed from "@site/src/components/PostEmbed"; 24 + 25 + import heroimage from "./parallel-universes.webp"; 26 + import xe from "@site/static/img/avatars/xe.jpg"; 27 + import geminiLogo from "@site/static/img/ai-models/gemini.webp"; 28 + 29 + <img 30 + src={heroimage} 31 + className="hero-image" 32 + alt="A blue tiger in several parallel universes demonstrating different kinds of things that can be done with AI" 33 + /> 34 + 35 + <center> 36 + <small> 37 + <em> 38 + A blue tiger in several parallel universes demonstrating different kinds 39 + of things that can be done with AI. 40 + </em> 41 + </small> 42 + </center> 43 + 44 + Our new feature, 45 + [bucket forking](https://www.tigrisdata.com/blog/fork-buckets-like-code/), lets 46 + you make an isolated copy of your large dataset instantly with zero copying. No 47 + more colliding in shared datasets or waiting hours for bytes to copy, fork your 48 + dataset like you fork your code, and experiment away. 
49 + 50 + ## Experimentation enabled by bucket forking 51 + 52 + Imagine a world where an AI company's researchers can instantly experiment with 53 + an entire, massive training dataset in object storage, without the days-long 54 + wait for copies or the uncertainty of using live data. This addresses a common 55 + pattern where developers create and then abandon numerous copies of central 56 + datasets for individual experiments, leading to significant duplication and 57 + wasted time. 58 + 59 + We’re proposing a new workflow: create per-user (or per-run) 60 + [forks](https://www.tigrisdata.com/docs/buckets/snapshots-and-forks/) of your 61 + dataset, do your experimentation and development, and then merge your updated 62 + data back into main. Just like git, and it’s as fast as forking a git repo. 63 + 64 + When you fork a bucket, you get an isolated copy that’s a metadata reference to 65 + your original dataset. Writes to the source bucket aren’t replicated to the 66 + fork: their timelines have diverged at the moment of the fork. Tigris only 67 + stores the changes, so there’s no duplication. 68 + 69 + ## Bucket forking and the scientific method 70 + 71 + Let’s follow an experiment using the scientific method: you want your model to 72 + match the painterly aesthetic of a video game, but you aren’t sure which subset 73 + of screenshots will finetune your model best. Should you train on screenshots of 74 + the entire game? Should you make individual “experts” for deserts vs ocean 75 + scenes? Should you remove the borders and menus? Do you really need to downscale 76 + the images to 512x512 like you did in the early days of Stable Diffusion v1.5? 77 + How about the aspect ratio or greyscaling... the list goes on. 78 + 79 + Each of these variations is an experiment, a parallel timeline for your data. 80 + Without forking, you’d need a frozen copy for each experiment for control, so 81 + you’d cull the list to minimize the number of parallel datasets. 
Or you’d share 82 + data across experiments and track the changes. But with forking, you can 83 + instantly make a copy so you can try all of them at once. Trivially. 84 + 85 + We're going to make changes across the entire dataset to optimize the data for 86 + the models we want to train. Instead of making multiple copies of the dataset, 87 + we're going to use bucket forking for these experiments. But, in order to talk 88 + about that, first we need to talk about parallel universes. 89 + 90 + ## Dataset experimentation with parallel universes 91 + 92 + In the real world, datasets arrive as messy unlabeled piles of bytes that we 93 + have to make sense of in order to do useful things. As an example, let’s take 94 + a 95 + [dataset of Nintendo Switch game screenshots](https://huggingface.co/datasets/XeIaso/switch-screenshots). 96 + With all this data, you could do any number of things, such as: 97 + 98 + - Train a categorization model, or something that can take these screenshots and 99 + then learn what patterns are associated with which games so I can upload new 100 + screenshots and have them automatically categorized with the right tags for 101 + the right game. 102 + - Train a style emulation LoRA adapter for existing text to image models that 103 + lets me create more images based on the style of individual games in that 104 + screenshot collection. 105 + - Use a combination of OCR and other language models to distill knowledge about 106 + the screenshots into a vision model. 107 + 108 + Today I’m going to show you how you would do this kind of experimental massaging 109 + from a bucket-forking native mindset. In my case, I want to take that dataset of 110 + Nintendo Switch screenshots and isolate things out so that I can train a Stable 111 + Diffusion LoRA on screenshots from 112 + [The Legend of Zelda: Breath of the Wild](https://en.wikipedia.org/wiki/The_Legend_of_Zelda:_Breath_of_the_Wild). 
 113 + This will require the following steps: 114 + 115 + - Importing all the jpegs into a dataset 116 + - Filtering out all the images that aren’t from Breath of the Wild 117 + - Synthesizing captions 118 + - Filtering out unwanted images (i.e. those in menus) 119 + - Sending it to a GPU for training 120 + 121 + We'll end up with four parallel timelines for our data, each a controlled lab 122 + for our experiments. Here's a sketch. 123 + 124 + ![Forking buckets to clean data and try three different experiments](./forks-timeline.svg) 125 + 126 + ### Fork 1: Data cleaning and labeling 127 + 128 + Right now my data is a giant pile of thousands of flat files I copied off of my 129 + Switch’s SD card. It’s got a bunch of filenames that look like this: 130 + 131 + ``` 132 + screenshots/2022/03/03/2022030300000900-1E1800B8D04F999C436DDFE2B8CD0B81.jpg 133 + ``` 134 + 135 + The filenames are broken down like this: 136 + 137 + ``` 138 + ${date}-${titleID}.jpg 139 + ``` 140 + 141 + So that example would be a screenshot of Dark Souls Remastered that I took in 142 + early March 2022. 
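To make the `${date}-${titleID}.jpg` breakdown concrete, here is a small sketch that splits a screenshot path into its capture time and title ID. It is an illustration, not the script used for the post: the function name is hypothetical, and it assumes the first 14 digits of the date field are `YYYYMMDDHHMMSS` with any remaining digits being a per-second counter.

```python
from datetime import datetime
from pathlib import PurePosixPath


def parse_screenshot_name(path: str) -> tuple[datetime, str]:
    """Split a Switch screenshot path into (capture time, title ID)."""
    # Drop the directories and the .jpg extension, leaving "<date>-<titleID>".
    stem = PurePosixPath(path).stem
    stamp, title_id = stem.split("-", 1)

    # Assumption: the first 14 digits are YYYYMMDDHHMMSS; trailing digits are ignored.
    taken_at = datetime.strptime(stamp[:14], "%Y%m%d%H%M%S")
    return taken_at, title_id


taken_at, title_id = parse_screenshot_name(
    "screenshots/2022/03/03/2022030300000900-1E1800B8D04F999C436DDFE2B8CD0B81.jpg"
)
```

Grouping files by the returned title ID is what lets an `imagefolder` style loader treat each game as its own label.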
143 + 144 + I had Claude write 145 + [a little shell script](https://gist.github.com/Xe/947d90506da90cd16d3ca91a9f892f15) 146 + that broke down this input folder and renamed the files like this: 147 + 148 + ``` 149 + ./var/switch-screenshots/train/${titleID}/${date}.jpg 150 + ``` 151 + 152 + Then I imported it to a Tigris bucket with a little bit of Python code: 153 + 154 + ```py 155 + from datasets import load_dataset 156 + import os 157 + 158 + BUCKET_NAME = "xe-screenshots-multiworld" 159 + storage_options = { 160 + "key": os.getenv("AWS_ACCESS_KEY_ID"), 161 + "secret": os.getenv("AWS_SECRET_ACCESS_KEY"), 162 + "endpoint_url": "https://fly.storage.tigris.dev" 163 + } 164 + 165 + ds = load_dataset("imagefolder", data_dir="./var/switch-screenshots", split="train") 166 + 167 + ds.save_to_disk(f"s3://{BUCKET_NAME}/images", storage_options=storage_options) 168 + ``` 169 + 170 + Then let’s freeze this state in time by creating a snapshot: 171 + 172 + ```py 173 + import boto3 174 + from botocore.client import Config 175 + 176 + def create_bucket_snapshot(bucket_name, desc): 177 + tigris = boto3.client( 178 + "s3", 179 + endpoint_url="https://t3.storage.dev", 180 + config=Config(s3={'addressing_style': 'virtual'}), 181 + ) 182 + 183 + tigris.meta.events.register( 184 + "before-sign.s3.CreateBucket", 185 + lambda request, **kwargs: request.headers.add_header( 186 + "X-Tigris-Snapshot", f"true; desc={desc}" 187 + ) 188 + ) 189 + return tigris.create_bucket(Bucket=bucket_name)["ResponseMetadata"]["HTTPHeaders"]["x-tigris-snapshot-version"] 190 + 191 + create_bucket_snapshot(BUCKET_NAME, "imported dataset from the disk") 192 + ``` 193 + 194 + And then we can make sure it’s there by listing all the snapshots: 195 + 196 + ```py 197 + def list_snapshots_for_bucket(bucket_name): 198 + tigris = boto3.client( 199 + "s3", 200 + endpoint_url="https://t3.storage.dev", 201 + config=Config(s3={'addressing_style': 'virtual'}), 202 + ) 203 + 204 + tigris.meta.events.register( 205 + 
"before-sign.s3.ListBuckets", 206 + lambda request, **kwargs: request.headers.add_header("X-Tigris-Snapshot", bucket_name) 207 + ) 208 + 209 + return tigris.list_buckets() 210 + 211 + for snapshot in list_snapshots_for_bucket(BUCKET_NAME)['Buckets']: 212 + name, desc = snapshot["Name"].split("; desc=") 213 + snaptime = snapshot["CreationDate"].strftime("%s") 214 + print(f"name={name} time={snaptime} desc=\"{desc}\"") 215 + ``` 216 + 217 + That returns something like this: 218 + 219 + ```javascript 220 + {'name': '1760036788104497556', 'time': '1760054788', 'desc': 'imported dataset from the disk'} 221 + ``` 222 + 223 + So we can use this snapshot to create a fork of the bucket: 224 + 225 + ```py 226 + def create_bucket_fork(bucket_name, from_bucket, snapshot_id=None): 227 + tigris = boto3.client( 228 + "s3", 229 + endpoint_url="https://t3.storage.dev", 230 + config=Config(s3={'addressing_style': 'virtual'}), 231 + ) 232 + 233 + tigris.meta.events.register( 234 + "before-sign.s3.CreateBucket", 235 + lambda request, **kwargs: ( 236 + request.headers.add_header("X-Tigris-Fork-Source-Bucket", from_bucket), 237 + ) 238 + ) 239 + if snapshot_id is not None: 240 + tigris.meta.events.register( 241 + "before-sign.s3.CreateBucket", 242 + lambda request, **kwargs: ( 243 + request.headers.add_header("X-Tigris-Fork-Source-Bucket-Snapshot", snapshot_id), 244 + ) 245 + ) 246 + tigris.create_bucket(Bucket=bucket_name) 247 + 248 + botw_only_bucket = f"{BUCKET_NAME}-botw" 249 + create_bucket_fork(botw_only_bucket, BUCKET_NAME, "1760036788104497556") 250 + ``` 251 + 252 + And then make some helpers to load the dataset from that fork: 253 + 254 + ```py 255 + from datasets import load_from_disk 256 + 257 + def load_timeline(bucket_name): 258 + return load_from_disk(f"s3://{bucket_name}/images", storage_options=storage_options) 259 + 260 + def save_timeline(ds, bucket_name): 261 + ds.save_to_disk(f"s3://{bucket_name}/images", storage_options=storage_options) 262 + ``` 263 + 264 + 
:::note 265 + 266 + Something cool about the bucket forking flow is that this treats your bucket 267 + paths as part of your public API. You don’t need to think about where you load 268 + the dataset from the bucket, because that’s not the variable that changed. 269 + You’re just loading the data from a different timeline. 270 + 271 + ::: 272 + 273 + ### Filtering 274 + 275 + From here we can filter everything that isn’t from Breath of the Wild out of the 276 + dataset. According to 277 + [the Switchbrew wiki](https://switchbrew.org/w/index.php?title=Title_list/Games&mobileaction=toggle_view_desktop), 278 + the title ID for Breath of the Wild is `F1C11A22FAEE3B82F21B330E1B786A39`. Let’s 279 + set this as a global variable and then filter everything else out: 280 + 281 + ```py 282 + BOTW_TITLE_ID = "F1C11A22FAEE3B82F21B330E1B786A39" 283 + 284 + ds = load_timeline(botw_only_bucket) 285 + ds = ds.filter(lambda x: ds.features['label'].names[x['label']] == BOTW_TITLE_ID) 286 + print(f"filtered dataset size: {len(ds)}") 287 + 288 + save_timeline(ds, botw_only_bucket) 289 + ``` 290 + 291 + Then we can make a snapshot of the bucket in this state: 292 + 293 + ```py 294 + botw_only_snapshot_id = create_bucket_snapshot(botw_only_bucket, "Dataset filtered down to only images of Breath of the Wild") 295 + ``` 296 + 297 + ### Fork 2: Caption Synthesis 298 + 299 + I use this dataset for other training projects so I don't want to apply the 300 + captions to the common dataset. I want to leave the underlying / central data 301 + the same and add my captions in its own little fork. 
Let’s start by diverging 302 + the timeline for captioning and make a fork: 303 + 304 + ```py 305 + caption_bucket = f"{BUCKET_NAME}-captions" 306 + create_bucket_fork(caption_bucket, botw_only_bucket, botw_only_snapshot_id) 307 + ds = load_timeline(caption_bucket) 308 + ``` 309 + 310 + From here we add an empty column for the text caption (our training workflow 311 + will require this to be called `text`): 312 + 313 + ```py 314 + text_data = [""] * len(ds) 315 + ds = ds.add_column("text", text_data) 316 + ``` 317 + 318 + Then we can generate high-quality captions using a few-shot process. For this I 319 + went into Breath of the Wild and captured some screenshots I’ll use to make my 320 + own high quality captions as examples for the language model. I’m including a 321 + few images in my dataset, capturing the following scenarios/scenes: 322 + 323 + - Link in a grassy field 324 + - Link in the desert 325 + - Link paragliding through the air 326 + - Menu interactions 327 + 328 + These base captions will help “ground” the model so it creates more captions 329 + like my examples. For the captioning I’m going to be using 330 + [gemma3:4b](https://ollama.com/library/gemma3:4b) on a local device, but you can 331 + use whatever model you want. 332 + 333 + :::note 334 + 335 + This is where you can fork the timeline to diverge\! 336 + 337 + ::: 338 + 339 + I’ll set up Ollama in another cell: 340 + 341 + ```py 342 + !pip install ollama 343 + import ollama 344 + 345 + OLLAMA_MODEL = "gemma3:4b" 346 + OLLAMA_URL = "http://192.168.2.12:11434" 347 + 348 + llm = ollama.Client(host=OLLAMA_URL) 349 + 350 + llm.pull(OLLAMA_MODEL) 351 + ``` 352 + 353 + And then my base image captioning code will look like this: 354 + 355 + <details> 356 + <summary>Longer code block</summary> 357 + 358 + ```py 359 + response = llm.chat(model=OLLAMA_MODEL, messages=[ 360 + { 361 + "role": "system", 362 + "content": "You are an expert image captioner assigned to caption images about video games. 
When given an image, make sure to only include the image caption, nothing else.", 363 + }, 364 + { 365 + "role": "user", 366 + "content": "Please caption this image", 367 + "images": [load_image_b64("./few_shot/botw/gerudo_desert.JPG")] 368 + }, 369 + { 370 + "role": "assistant", 371 + "content": "in_BOTW The Gerudo Desert, Link is facing the camera, A mountain range in the distance, A shrine surrounded by palm trees", 372 + }, 373 + { 374 + "role": "user", 375 + "content": "Please caption this image", 376 + "images": [load_image_b64("./few_shot/botw/menus.JPG")] 377 + }, 378 + { 379 + "role": "assistant", 380 + "content": "in_BOTW A menu showing Link's armor sets, Desert Voe Trousers, Inventory menu", 381 + }, 382 + { 383 + "role": "user", 384 + "content": "Please caption this image", 385 + "images": [load_image_b64("./few_shot/botw/lanaru_rocks.JPG")] 386 + }, 387 + ]) 388 + ``` 389 + 390 + </details> 391 + 392 + When I give it an example image, such as this: 393 + 394 + ![Link looking out over the rocks to the southeast near Lake Hylia in The Legend of Zelda: Breath of the Wild](./img/hylia-rocks.jpg) 395 + 396 + I get a caption like this: 397 + 398 + ``` 399 + in_BOTW Link standing atop a hill in Hyrule, overlooking the landscape 400 + ``` 401 + 402 + This is good enough for me! 
Now to apply this to the entire dataset: 403 + 404 + <details> 405 + <summary>Longer code block</summary> 406 + 407 + ```py 408 + from base64 import b64encode 409 + from io import BytesIO 410 + 411 + def load_image_b64(fname): 412 + with open(fname, "rb") as fin: 413 + data = fin.read() 414 + 415 + b64 = b64encode(data).decode("utf-8") 416 + 417 + return b64 418 + 419 + def pil_to_b64(image): 420 + buf = BytesIO() 421 + image.save(buf, format="JPEG") 422 + return b64encode(buf.getvalue()).decode("utf-8") 423 + 424 + def fabricate_caption(row): 425 + response = llm.chat(model=OLLAMA_MODEL, messages=[ 426 + { 427 + "role": "system", 428 + "content": "You are an expert image captioner assigned to caption images about video games. When given an image, make sure to only include the image caption, nothing else.", 429 + }, 430 + { 431 + "role": "user", 432 + "content": "Please caption this image", 433 + "images": [load_image_b64("./few_shot/botw/gerudo_desert.JPG")], 434 + }, 435 + { 436 + "role": "assistant", 437 + "content": "in_BOTW The Gerudo Desert, Link is facing the camera, A mountain range in the distance, A shrine surrounded by palm trees", 438 + }, 439 + { 440 + "role": "user", 441 + "content": "Please caption this image", 442 + "images": [load_image_b64("./few_shot/botw/menus.JPG")], 443 + }, 444 + { 445 + "role": "assistant", 446 + "content": "in_BOTW A menu showing Link's armor sets, Desert Voe Trousers, Inventory menu", 447 + }, 448 + { 449 + "role": "user", 450 + "content": "Please caption this image", 451 + "images": [load_image_b64("./few_shot/botw/lanaru_rocks.JPG")], 452 + }, 453 + { 454 + "role": "assistant", 455 + "content": "in_BOTW Link standing atop a series of rocks overlooking the landscape, blue partially cloudy sky", 456 + }, 457 + { 458 + "role": "user", 459 + "content": "Please caption this image", 460 + "images": [load_image_b64("./few_shot/botw/paragliding.JPG")], 461 + }, 462 + { 463 + "role": "assistant", 464 + "content": "in_BOTW Link 
paragliding over Hyrule field with mountains in the distance, an empty field of green grass is below him" 465 + }, 466 + { 467 + "role": "user", 468 + "content": "Please caption this image", 469 + "images": [pil_to_b64(row["image"])], 470 + }, 471 + ]) 472 + 473 + row["text"] = response.message.content 474 + return row 475 + 476 + ds = ds.map(fabricate_caption) 477 + ``` 478 + 479 + </details> 480 + 481 + Perfect\! Now let’s save it: 482 + 483 + ```py 484 + save_timeline(ds, caption_bucket) 485 + caption_snapshot_id = create_bucket_snapshot(caption_bucket, "Added captions to the dataset") 486 + caption_snapshot_id 487 + ``` 488 + 489 + ### Fork 3: Better captioning and different models 490 + 491 + When I was looking through the dataset I noticed that some of the captions 492 + weren’t ideal, so I thought that I should redo them by changing the prompting 493 + theory to be closer to what Stable Diffusion XL natively prefers. However, I 494 + don't know if this new method will be any better, so I want to preserve the 495 + first experiment and compare the two sets of captions. That calls for forking 496 + the bucket again, starting from the snapshot taken before any captions were 497 + added: 498 + 499 + ```py 500 + better_caption_bucket = f"{BUCKET_NAME}-better-captions" 501 + create_bucket_fork(better_caption_bucket, botw_only_bucket, botw_only_snapshot_id) 502 + ``` 503 + 504 + I took one of the few-shotted images and asked Gemini 2.5 Pro to describe it as 505 + a Stable Diffusion prompt: 506 + 507 + import gerudoShrine from "./img/gerudo-shrine.jpg"; 508 + 509 + <Conv name="Xe" profilePic={xe} top> 510 + Can you describe this image as if it was a Stable Diffusion XL prompt? Think about how it would be represented if it was the subtitle in a museum. 
511 + 512 + <img 513 + style={{ maxWidth: "95%" }} 514 + src={gerudoShrine} 515 + alt="A picture of Link in the Gerudo Desert wearing a lightweight desert outfit 516 + near a black shrine with blue leylines all over it." 517 + /> 518 + 519 + </Conv> 520 + <Conv name="Gemini 2.5 Pro" profilePic={geminiLogo} bottom> 521 + masterpiece, best quality, cinematic screenshot from The Legend of Zelda: Breath of the Wild, (view from behind of Link:1.2) wearing golden pauldrons and Gerudo desert gear, standing in a sun-drenched desert oasis with palm trees, a massive ancient stone monolith Sheikah Shrine covered in (glowing blue neon constellations and intricate patterns:1.3), wide shot, painterly cel-shaded art style, vibrant colors, strong shadows, epic fantasy, atmospheric, distant rocky mesas under a bright blue sky with wispy clouds. 522 + </Conv> 523 + 524 + This is much better than the older captions and will likely get better results 525 + when doing training. I made a new timeline forked from before I made captions 526 + the first time: 527 + 528 + ```py 529 + ds = load_timeline(better_caption_bucket) 530 + 531 + text_data = [""] * len(ds) 532 + ds = ds.add_column("text", text_data) 533 + ``` 534 + 535 + I replaced all of the few-shotted captions in my notebook with Gemini-generated 536 + captions like that: 537 + 538 + <details> 539 + <summary>Longer code block</summary> 540 + 541 + ```py 542 + def fabricate_caption(row): 543 + response = llm.chat(model=OLLAMA_MODEL, messages=[ 544 + { 545 + "role": "system", 546 + "content": "You are an expert image captioner assigned to caption images about video games. 
When given an image, make sure to only include the image caption, nothing else.", 547 + }, 548 + { 549 + "role": "user", 550 + "content": "Please caption this image", 551 + "images": [load_image_b64("./few_shot/botw/gerudo_desert.JPG")], 552 + }, 553 + { 554 + "role": "assistant", 555 + "content": "in_BOTW masterpiece, best quality, cinematic screenshot from The Legend of Zelda: Breath of the Wild, (view from behind of Link:1.2) wearing golden pauldrons and Gerudo desert gear, standing in a sun-drenched desert oasis with palm trees, a massive ancient stone monolith Sheikah Shrine covered in (glowing blue neon constellations and intricate patterns:1.3), wide shot, painterly cel-shaded art style, vibrant colors, strong shadows, epic fantasy, atmospheric, distant rocky mesas under a bright blue sky with wispy clouds.", 556 + }, 557 + { 558 + "role": "user", 559 + "content": "Please caption this image", 560 + "images": [load_image_b64("./few_shot/botw/menus.JPG")], 561 + }, 562 + { 563 + "role": "assistant", 564 + "content": "in_BOTW masterpiece, best quality, 8k, official art, high-resolution screenshot from The Legend of Zelda: Breath of the Wild, video game inventory menu screen, a grid of armor icons and a detailed item description box for \"Desert Voe Trousers\", on the right a full-body 3D model of the character Link wearing the Desert Voe armor set with a golden bow, clean UI design, cel-shaded art style, fantasy, adventure game.", 565 + }, 566 + { 567 + "role": "user", 568 + "content": "Please caption this image", 569 + "images": [load_image_b64("./few_shot/botw/lanaru_rocks.JPG")], 570 + }, 571 + { 572 + "role": "assistant", 573 + "content": "in_BOTW masterpiece, best quality, 8k, cinematic screenshot from The Legend of Zelda: Breath of the Wild, (view from behind of Link:1.2) wearing blue armor and a blue hood, standing on a lush green hilltop, looking out over a vast, expansive landscape of rolling green hills, rocky cliffs, and distant mountains under a 
bright blue sky with scattered clouds, wide shot, painterly cel-shaded art style, vibrant colors, strong shadows, epic fantasy, atmospheric.", 574 + }, 575 + { 576 + "role": "user", 577 + "content": "Please caption this image", 578 + "images": [load_image_b64("./few_shot/botw/paragliding.JPG")], 579 + }, 580 + { 581 + "role": "assistant", 582 + "content": "in_BOTW masterpiece, best quality, 8k, cinematic screenshot from The Legend of Zelda: Breath of the Wild, (view from behind of Link:1.2) soaring through the air with a paraglider over a vast, sunlit green valley, with distant, majestic mountains in the background under a clear blue sky, dynamic action shot, painterly cel-shaded art style, vibrant colors, epic fantasy, atmospheric, sense of freedom." 583 + }, 584 + { 585 + "role": "user", 586 + "content": "Please caption this image", 587 + "images": [pil_to_b64(row["image"])], 588 + }, 589 + ]) 590 + 591 + row["text"] = response.message.content 592 + return row 593 + 594 + ds = ds.map(fabricate_caption) 595 + ``` 596 + 597 + </details> 598 + 599 + Then I investigated a few image-caption pairs: 600 + 601 + ![In a sun-drenched desert town from The Legend of Zelda, the character Link stands wearing the Gerudo Vai disguise, which includes a green top, baggy pants, a face veil, and a bow slung over his back.](./img/gerudo-outfit.webp) 602 + 603 + <center> 604 + <i> 605 + masterpiece, best quality, 8k, cinematic screenshot from The Legend of 606 + Zelda: Breath of the Wild, (close up of Zelda):1.3, Gerudo champion, wearing 607 + Gerudo clothing, a golden headband and earrings, standing in the shade of a 608 + stone building in the Gerudo Desert, golden accents, detailed armor, warm 609 + lighting, cinematic, fantasy art. 
610 + </i> 611 + </center> 612 + 613 + ![In a bustling campsite at the foot of a rocky mountain, the video game character Link joyfully cooks spiky yellow fruits in a large steaming pot, while a surprised elderly painter recoils from her easel nearby.](./img/link-cooking.webp) 614 + 615 + <center> 616 + <i> 617 + masterpiece, best quality, 8k, interior shot, cozy scene from The Legend of 618 + Zelda: Breath of the Wild, Link and Impa preparing a meal over a crackling 619 + campfire inside a simple wooden shelter, with a rustic interior decorated 620 + with furs and wooden furniture, warm lighting, cooking ingredients, charming 621 + atmosphere, fantasy setting, vibrant colors, detailed environment. 622 + </i> 623 + </center> 624 + 625 + This would be much better to train a LoRA on\! 626 + 627 + ### Fork 4: Resizing to train Stable Diffusion 628 + 629 + Now that I have the images and captions, I want to start optimizing the image 630 + size for the model I want to train. This requires a destructive action across 631 + every image in the dataset. 632 + 633 + When training Stable Diffusion, you generally want your images to land near the 634 + resolution the model was trained at: 635 + 636 + - If you are using Stable Diffusion v1.5, you want the pixel count of your images 637 + to be close to 512x512 (e.g. 512x512, 768x384, etc.) 638 + - If you are using Stable Diffusion XL, you want the pixel count of your images 639 + to be close to 1024x1024 (e.g. 1024x1024, 1344x768, etc.) 640 + 641 + Actions across the entire dataset like this are a poster child for making a 642 + fork: 643 + 644 + ```py 645 + resized_bucket = f"{BUCKET_NAME}-512-centre-crop" 646 + create_bucket_fork(resized_bucket, better_caption_bucket) 647 + ``` 648 + 649 + Based on the fact that all of my images are 1280x720, without upscaling them the 650 + best model to train against is Stable Diffusion v1.5. This means I need to 651 + centre crop the images to 512x512.
Another experiment/fork could involve 652 + resizing the images to something like 768x384 to get closer to the original 16:9 653 + aspect ratio. Here's how I resized all the images in the dataset: 654 + 655 + ```py 656 + from PIL import Image 657 + 658 + def center_crop_resize(image, size=512): 659 + width, height = image.size 660 + min_side = min(width, height) 661 + left = (width - min_side) // 2 662 + top = (height - min_side) // 2 663 + right = left + min_side 664 + bottom = top + min_side 665 + cropped = image.crop((left, top, right, bottom)) 666 + return cropped.resize((size, size), Image.Resampling.LANCZOS) 667 + 668 + def resize_row(row): 669 + row["image"] = center_crop_resize(row["image"]) 670 + return row 671 + 672 + ds = load_timeline(resized_bucket) 673 + ds = ds.map(resize_row) 674 + ``` 675 + 676 + Once I’m done, I just save the data to Tigris so I can try using it in training: 677 + 678 + ```py 679 + save_timeline(ds, resized_bucket) 680 + snapshot_id = create_bucket_snapshot(resized_bucket) 681 + ``` 682 + 683 + Et voilà! I can do my Stable Diffusion v1.5 training fearlessly. 684 + 685 + ## Thinking with portals 686 + 687 + In the process of working on this, we ended up with five different versions of 688 + our dataset: the base and four forks. These are: 689 + 690 + - The base dataset (mostly unaltered from the original Switch SD card) 691 + - A version of it that only has screenshots of Breath of the Wild 692 + - A version of it with a first pass at captioning 693 + - A version of it with a better pass at captioning 694 + - A version of the better pass at captioning but cropped and resized for Stable 695 + Diffusion training 696 + 697 + See how this maps more cleanly to the experimental process? The artifacts of 698 + this are easily visible and nothing was deleted in the process. This is how you 699 + think with ~~portals~~ bucket forking! 700 + 701 + From here all you have to do is submit your dataset to be trained.
I’d suggest 702 + doing some more filtering before training such as removing rows with keywords 703 + like “menu”, “message”, or “text” in them, but this is something you can freely 704 + experiment with on your own. If you want to check out the dataset I ended up with 705 + when I was writing this post, I posted 706 + [a copy of it](https://huggingface.co/datasets/XeIaso/botw-screenshots-captioned) 707 + to Hugging Face. 708 + 709 + - The raw set of all of my Switch screenshots: 710 + [XeIaso/switch-screenshots](https://huggingface.co/datasets/XeIaso/switch-screenshots). 711 + - The Breath of the Wild screenshots with better captions: 712 + [XeIaso/botw-screenshots-captioned](https://huggingface.co/datasets/XeIaso/botw-screenshots-captioned). 713 + - The centre cropped screenshots with better captions: 714 + [XeIaso/botw-screenshots-captioned-square](https://huggingface.co/datasets/XeIaso/botw-screenshots-captioned-square). 715 + 716 + I plan to train Stable Diffusion on it when I get the time. 717 + 718 + The important thing to keep in mind is this model of being able to reset 719 + things and fork the timeline when they go awry or when you want to try multiple 720 + different things. What if you tried using a different model for describing the 721 + screenshots? What if you tried a different few-shot prompting flow? What if your 722 + filtering logic was different? Each of those can be cleanly forked off and tried 723 + in its own little universe without impacting any of the other data. All of the data 724 + in the source bucket remains safe in Tigris even though we’ve been whittling it 725 + down as things get filtered out. 726 + 727 + See how this maps to the normal experimental workflow we do with things like 728 + chemistry, physics, and other natural sciences? If at first you don’t succeed by 729 + changing a variable you think will work out, travel back in time, destroy the 730 + universe you were just in, and try again.
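
The keyword filter suggested above is one experiment like that. Here's a minimal sketch of it, assuming captions live in the `text` column the way they do in the rest of this post. `UI_KEYWORDS`, `is_scene_caption`, and `scenes_only_bucket` are made-up names for illustration, and the keyword list hasn't been tuned against the real dataset:

```py
import re

# Keywords from the suggestion above. Captions that mention these are mostly
# UI screenshots rather than scenes. The exact list is an assumption to tune.
UI_KEYWORDS = ("menu", "message", "text")

# Match whole words (with an optional plural "s") so that words like
# "textured" or "context" don't count as hits.
UI_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in UI_KEYWORDS) + r")s?\b",
    re.IGNORECASE,
)

def is_scene_caption(row):
    """Return True when the row's caption mentions none of the UI keywords."""
    return UI_PATTERN.search(row["text"]) is None

# Hypothetical usage in its own fork, reusing the helpers from earlier:
#   ds = load_timeline(scenes_only_bucket)
#   ds = ds.filter(is_scene_caption)
#   save_timeline(ds, scenes_only_bucket)
```

Because this only looks at the caption text, it's only as good as the captions are. If the model mislabels a menu as a landscape, that row stays in, which is one more reason to try this in a fork instead of on the main timeline.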
731 + 732 + As our dataset complexity grows, isolation like this and tight controls over 733 + data provenance become so much more important, and so much more painful to 734 + manage manually. Tigris’ bucket snapshots and forks make it trivial for you to grab a 735 + dataset from cold storage and then do whatever transformations you want without 736 + the fear of overwriting something important. 737 + 738 + <InlineCta 739 + title={"Wanna try object storage with snapshots and forking?"} 740 + subtitle={ 741 + "Tigris lets you take your data, store it globally, and then fork it fearlessly; all without egress fees. Try it today!" 742 + } 743 + button="Check out the docs" 744 + link="https://www.tigrisdata.com/docs/buckets/snapshots-and-forks/" 745 + />
+568
.claude/skills/xe-writing-style/assets/rolling-ladder-behind-us.mdx
··· 1 + --- 2 + title: "Rolling the ladder up behind us" 3 + desc: | 4 + Who will take over for us if we don't train the next generation to replace us? A critique of craft, AI, and the legacy of human expertise. 5 + date: 2025-06-20 6 + hero: 7 + ai: "Photo by Xe Iaso, Canon EOS R6 Mark 2, unknown lens" 8 + file: "summer-walk" 9 + prompt: 10 + "A picture of two patches of wild grass bifurcated by a retaining pond." 11 + social: false 12 + --- 13 + 14 + import Conv from "../../_components/XeblogConv.tsx"; 15 + 16 + export const Aoi = ({ children, mood }) => ( 17 + <Conv name="Aoi" mood={mood}> 18 + {children} 19 + </Conv> 20 + ); 21 + 22 + export const Cadey = ({ children, mood }) => ( 23 + <Conv name="Cadey" mood={mood}> 24 + {children} 25 + </Conv> 26 + ); 27 + 28 + Cloth is one of the most important goods a society can produce. Clothing is 29 + instrumental for culture, expression, and for protecting one's modesty. 30 + Historically, cloth was one of the most expensive items on the market. People 31 + bought one or two outfits at most and then wore them repeatedly for the rest of 32 + their lives. Clothing was treasured and passed down between generations the same 33 + way we pass jewelry down between generations. This cloth was made in factories 34 + by highly skilled weavers. These weavers had done the equivalent of PhD studies 35 + in weaving cloth and used state of the art hardware to do it. 36 + 37 + As factories started to emerge, they were able to make cloth so much more 38 + cheaply than skilled weavers ever could thanks to inventions like the power 39 + loom. Power looms didn't require skilled workers operating them. You could even 40 + staff them with war orphans, which there was an abundance of thanks to all the 41 + wars. The quality of the cloth was absolutely terrible in comparison, but there 42 + was so much more of it made so much more quickly. 
This allowed the price of 43 + cloth to plummet, meaning that the wages that the artisans made fell from six 44 + shillings a day to six shillings per week over a period of time where the price 45 + of food doubled. 46 + 47 + Mind you, the weavers didn't just reject technological progress for the sake of 48 + rejecting it. They tried to work with the ownership class and their power looms 49 + in order to produce the same cloth faster and cheaper than they had before. For 50 + a time, it did work out, but the powers that be didn't want that. They wanted 51 + more money at any cost. 52 + 53 + At some point, someone had enough and decided to do something about it. Taking 54 + up the name Ned, he led a movement that rioted and destroyed factory 55 + equipment. Some of the riots got so bad that the army had to be called in to break them up. 56 + Townspeople local to those factory towns were in full support of Ned's 57 + followers. Heck, even the soldiers sent to stop the riots ended up seeing the 58 + points behind what Ned's followers were doing and joined in themselves. 59 + 60 + The ownership class destroyed the livelihood of the skilled workers so that they 61 + could make untold sums of money producing terrible cloth that turned people's 62 + one-time purchase of clothing into a de facto subscription that they had 63 + to renew every time their clothing wore out. Now we have fast fashion and don't 64 + expect our clothing to last more than a few years. I have a hoodie from AWS 65 + Re:Invent in 2022 that I'm going to have to throw out and replace because the 66 + sleeves are dying. 67 + 68 + We only remember them as riots because their actions affected those in power. 69 + This movement was known as the Luddites, or the followers of Ned Ludd. The word 70 + "luddite" has since shifted meaning over time and is now understood as "someone 71 + who is against technological development".
The Luddites were not against 72 + technology like the propaganda from the ownership class would have you expect, 73 + they fought against how it was implemented and the consequences of its rollout. 74 + They were skeptical that the shitty cloth that the power loom produced would be 75 + a net benefit to society because it meant that customers would inevitably have 76 + to buy their clothes over and over again, turning a one-time purchase into a 77 + subscription. Would that really benefit consumers or would that really benefit 78 + the owners of the factories? 79 + 80 + Nowadays the Heritage Crafts Association of the United Kingdom lists many forms 81 + of weaving as 82 + [Endangered or Critically Endangered crafts](https://www.heritagecrafts.org.uk/categories-of-risk/), 83 + meaning that those skills are either at critical risk of dying out without any 84 + "fresh blood" learning how to do it, or the last generation of artisans that 85 + know how to do that craft are no longer teaching new apprentices. All that 86 + remains of that expertise is now contained in the R\&D departments of the 87 + companies that produce the next generations of power looms, and whatever 88 + heritage crafts practitioners remain. 89 + 90 + Remember the Apollo program that let us travel to the moon? It was mostly 91 + powered by the Rocketdyne F1 engine. We have all of the technical specifications 92 + to build that rocket engine. We know all the parts you need, all the machining 93 + you have to do, and roughly understand how it would be done, but 94 + [we can't build another Rocketdyne F1](https://youtu.be/ovD0aLdRUs0) because all 95 + of the finesse that had been built up around manufacturing it no longer exists. 96 + Society has moved on and we don't have expertise in the tools that they used to 97 + make it happen. 98 + 99 + What are we losing in the process? We won't know until it's gone. 
100 + 101 + ## We're going to run out of people with the word "Senior" in their title 102 + 103 + As I've worked through my career in computering, I've noticed a paradox that's 104 + made me uneasy and I haven't really been able to figure out why it keeps showing 105 + up: the industry only ever seems to want to hire people with the word Senior in 106 + their title. They almost never want to create people with the word Senior in 107 + their title. This is kinda concerning for me. People get old and no longer want 108 + to or are able to work. People get sick and become disabled. Accidental deaths 109 + happen and remove people from the workforce. 110 + 111 + <Picture 112 + path="blog/2025/rolling-ladder-behind-us/dog-meme" 113 + desc="A meme based on the format where the dog wants to fetch the ball but doesn't want to give the ball to the human to throw it, but with the text saying 'Senior?', 'Train Junior?', and 'No train junior, only hire senior'." 114 + /> 115 + 116 + If the industry at large isn't actively creating more people with the word 117 + Senior in their title, we are eventually going to run out of them. This is 118 + something that I want to address with Techaro at some point, but I'm not sure 119 + how to do that yet. I'll figure it out eventually. The non-conspiratorial angle 120 + for why this is happening is that money isn't free anymore and R&D salaries are 121 + no longer immediately deductible business expenses in the US, so software jobs that don't 122 + "produce significant value" are more risky to the company. So of course they'd 123 + steal from the future to save today. Sounds familiar, doesn't it? 124 + 125 + <Cadey mood="coffee"> 126 + Is this how we end up losing the craft of making high quality code the same 127 + way we lost the craft of weaving high quality cloth? 
128 + </Cadey> 129 + 130 + However there's another big trend in the industry that concerns me: companies 131 + releasing products that replace expertise with generative AI agents that just 132 + inscrutably do the thing for you. This started out innocently enough \- it was 133 + just better ways to fill in the blanks in your code. But this has ballooned and 134 + developed from better autocomplete to the point where you can 135 + [just assign issues to GitHub Copilot](https://github.com/orgs/community/discussions/159068) 136 + and have the issue magically get solved for you in a pull request. Ask the AI 137 + model for an essay and get a passable result in 15 minutes. 138 + 139 + At some level, this is really cool. Like, think about it. This reduces toil and 140 + drudgery to waiting for half an hour at most. In a better world I would really 141 + enjoy having a tool like this to help deal with the toil work that I need to do 142 + but don't really have the energy to. Do you know how many more of these essays 143 + would get finished if I could offload some of the drudgery of my writing process 144 + to a machine? 145 + 146 + We are not in such a better world. We are in a world where I get transphobic 147 + hate sent to the Techaro sales email. We are in a world where people like me are 148 + intentionally not making a lot of noise so that we can slide under the radar and 149 + avoid attention by those that would seek to destroy us. We are in a world where 150 + these AI tools are being pitched as the 151 + [next Industrial Revolution](https://www.salesforce.com/news/stories/agentic-ai-reshapes-workforce/), 152 + one where foisting our expertise away into language models is somehow being 153 + framed as a good thing for society. 154 + 155 + There's just one small problem: who is going to be paid and reap the benefits 156 + from this change as expectations from the ownership class change? 
A lot of the 157 + ownership class only really experiences the work product outputs of what we do 158 + with computers. They don't know the struggles involved with designing things 159 + such as 160 + [the user getting an email on their birthday](https://youtu.be/y8OnoxKotPQ). 161 + They don't want to get pushback on things being difficult or to hear that people 162 + want to improve the quality of the code. They want their sparkle emoji buttons 163 + to magically make the line go up and they want them yesterday. 164 + 165 + We deserve better than cheaply made, mass-produced slop that 166 + incidentally does what people want. We deserve high quality products that are 167 + crafted to be exactly what people need, even if they don't know they need it. 168 + 169 + Additionally, if this is such a transformational technology, why are key figures 170 + promoting it by talking down to people? Why wouldn't they be using this to _lift 171 + people up_? 172 + 173 + <ConvP> 174 + <Aoi mood="wut"> 175 + Isn't that marketing? [Fear 176 + sells](https://www.latimes.com/business/technology/story/2023-03-31/column-afraid-of-ai-the-startups-selling-it-want-you-to-be) 177 + a lot better than hope ever will. Amygdala responses are pretty strong, 178 + right? So aren't a lot of your fears of the technology really feeding into 179 + the hype and promoting the technology by accident? 180 + </Aoi> 181 + <Cadey mood="coffee"> 182 + I don't fear the power loom. I fear the profit expectations of the factory 183 + owners. 184 + </Cadey> 185 + </ConvP> 186 + 187 + ## Vibe coding is payday loans for technical debt 188 + 189 + As a technical educator, one of the things that I want to imprint onto people is 190 + that programming is a skill you can gain and that you too can both program 191 + things and learn how to program things. I want there to be more programmers out 192 + there.
What I am about to say is not an attempt to gatekeep the skill and craft 193 + of computering; however, the ways that proponents of vibe coding are going about 194 + it are simply not the way forward to a sustainable future. 195 + 196 + About a year ago, Cognition teased an AI product named 197 + [Devin](https://devin.ai), a completely automated software engineer. You'd 198 + assign Devin tasks in Slack or Jira and then it would spin up a VM and plod its 199 + way through fixing whatever you asked it to. This demo deeply terrified me, as 200 + it was nearly identical to a story I wrote for the Techaro lore: 201 + [Protos](https://xeiaso.net/blog/protos/). The original source of that satire 202 + was experience working at a larger company that shall remain unnamed where the 203 + product team seemed to operate under the assumption that the development team 204 + had a secret "just implement that feature button" and that we as developers were 205 + working to go out of our way to **NOT** push it. 206 + 207 + Devin was that "implement that feature" button the same way Protos mythically 208 + did. From what I've seen with companies that actually use Devin, it's nowhere 209 + near actually being useful and usually needs a lot of hand-holding to do 210 + anything remotely complicated, thank God. 211 + 212 + The thing that really makes me worried is that the ownership class' expectations 213 + about the process of developing software are changing. People are being put on 214 + PIPs for not wanting to install Copilot. 
Deadlines come faster because 215 + ["the AI can write the code for you, right?"](https://www.reddit.com/r/ExperiencedDevs/comments/1kchah5/they_finally_started_tracking_our_usage_of_ai/) 216 + Twitter and Reddit contain myriads of stories of "idea guys" using Cursor or 217 + Windsurf to generate their dream app's backend and then making posts like 218 + "some users claim they can see other people's stuff, what kind of developer do I 219 + need to hire for this?" Follow-up posts include gems such as "lol why do coders 220 + charge so much???" 221 + 222 + By saving money in the short term by producing shitty software that doesn't 223 + last, are we actually spending more money over time re-buying nearly identical 224 + software after it evaporates from light use? This is the kind of thing that 225 + makes Canada not allow us to self-identify as Engineers, and I couldn't agree with 226 + their point more. 227 + 228 + ### Vibe Coding is just fancy UX 229 + 230 + Vibe coding is a distraction. It's a meme. It will come. It will go. Everyone 231 + will abandon the vibe coding tools eventually. My guess is that a lot of the 232 + startups propping up their vibe coding tools are trying to get people into 233 + monthly subscriptions as soon as possible so that they can mine passive income 234 + as their more casual users slowly give up on coding and just forget about the 235 + subscription. 236 + 237 + I'm not gonna lie though, the UX of vibe coding tools is top-notch. From a 238 + design standpoint it's aiming for that subtle brilliance where it seems to read 239 + your mind and then fill in the blanks you didn't even know you needed filled in. 240 + This is a huge part of how you can avoid the terror of the empty canvas. If you 241 + know what you are doing, an empty canvas represents infinite possibilities. 242 + There's nothing there to limit you from being able to do it. You have total 243 + power to shape everything.
244 + 245 + In my opinion, this is a really effective tool to help you get past that fear of 246 + having no ground to stand on. This helps you get past executive dysfunction and 247 + just ship things already. That part is a good thing. I genuinely want people to 248 + create more things with technology that are focused on the problems that they 249 + have. This is the core of how you learn to do new things. You solve small 250 + problems that can be applied to bigger circumstances. You gradually increase the 251 + scope of the problem as you solve individual parts of it. 252 + 253 + I want more people to be able to do software development. I think that it's a 254 + travesty that we don't have basic computer literacy classes in every stage of 255 + education so that people know how the machines **that control their lives** work 256 + and how to use them to their advantage. Sure it's not as dopaminergic as TikTok 257 + or other social media apps, but there's a unique sense of victory that you get 258 + when things just work. Sometimes that feeling you get when things Just Work™ is 259 + the main thing that keeps me going. Especially in anno Domini two thousand and 260 + twenty-five. 261 + 262 + The main thing I'm afraid of is people becoming addicted to the vibe coding 263 + tools and letting their innate programming skills atrophy. I don't know how to 264 + suggest people combat this. I've been combating it by removing all of the 265 + automatic AI assistance from my editor (i.e. I'll use a language server, but I 266 + won't have my editor do fill-in-the-middle autocomplete for me), but this isn't 267 + something that works for everyone. I've found myself more productive without it 268 + there and asking 269 + [a model for the missing square peg to round hole](https://chatgpt.com/share/685430cf-79ec-800c-9ea2-251301066f3d) 270 + when I inevitably need some toil code made. I ended up not shipping that due to 271 + other requirements, but you get what I'm getting at.
272 + 273 + ### The "S" in MCP stands for Security 274 + 275 + The biggest arguments I have against vibe coding and all of the tools behind it 276 + boil down to one major point: these tools have a security 277 + [foundation of sand](https://www.biblegateway.com/passage/?search=matthew%207:24-27&version=NIV). 278 + Most of the time when you install and configure a 279 + [Model Context Protocol (MCP)](https://modelcontextprotocol.io/introduction) 280 + server, you add some information to a JSON file that your editor uses to know 281 + what tools it can dispatch with all of your configuration and API tokens. These 282 + MCP servers run as normal OS processes with absolutely no limit to what they can 283 + do. They can easily delete all files on your system, install malware into your 284 + autostart, or exfiltrate all your secrets without any oversight. 285 + 286 + Oh, by the way, that whole "it's all in one JSON file with all your secrets" 287 + problem? That's now seen as a load-bearing feature so that scripts can 288 + automatically install MCP servers for you. You don't even need to gain expertise 289 + in how the tools work\! There's an MCP server installer MCP server so that you 290 + can say "Hey torment nexus, install GitHub integration for me please" and then 291 + it'll just do it with no human oversight or review of what you're actually 292 + installing. Seems safe to me\! What could possibly go wrong? 293 + 294 + If this is seriously the future of our industry, I wish that the people involved 295 + would take one trillionth of an iota of care about the security of the 296 + implementation. This is the poster child for something like the 297 + [WebAssembly Component Model](https://component-model.bytecodealliance.org/). 298 + This would let you define your MCP servers with strongly typed interfaces to the 299 + outside world that can be granted or denied permissions by users with strong 300 + capabilities.
Combined with the concept of 301 + [server resources](https://modelcontextprotocol.io/specification/2025-06-18/server/resources), 302 + this could let you expand functionality however you wanted. Running in 303 + WebAssembly means that no MCP server can just read `~/.ssh/id_ed25519` and 304 + exfiltrate your SSH key. Running in WebAssembly means that it can't just connect 305 + to `probably-not-malware.lol` and then evaluate JavaScript code with user-level 306 + permissions on the fly. We shouldn't have to be telling developers "oh just run 307 + it all in Docker". We should have designed this to be fundamentally secure from 308 + the get-go. Personally, I only run MCP ecosystem things when contractually 309 + required to. Even then, I run it in a virtual machine that I've already marked 310 + as known compromised and use separate credentials not tied to me. Do with this 311 + information as you will. 312 + 313 + I had a lot of respect for Anthropic before they released this feculent bile 314 + that is the Model Context Protocol spec and initial implementations to the 315 + public. It just feels so half-baked and barely functional. Sure I don't think 316 + they expected it to become the Next Big Meme™, but I thought they were trying to 317 + do things ethically above board. Everything I had seen from Anthropic before had 318 + such a high level of craft and quality, and this was such a huge standout. 319 + 320 + We shouldn't have to be placing fundamental concerns like secret management or 321 + sandboxing as hand-waves to be done opt-in by the user. They're not gonna do it, 322 + and we're going to have 323 + [more incidents where Cursor goes rogue and nukes your home folder](https://forum.cursor.com/t/cursor-yolo-deleted-everything-in-my-computer/103131) 324 + until someone cares enough about the craft of the industry to do it the right 325 + way.
326 + 327 + ## Everyone suffers so the few can gain 328 + 329 + I have a unique view into a lot of the impact that AI companies have had across 330 + society. I'm the CEO of [Techaro](https://techaro.lol), a small one-person 331 + startup that develops [Anubis](https://anubis.techaro.lol), a Web AI Firewall 332 + Utility that helps mitigate the load of automated mass scraping so that open 333 + source infrastructure can stay online. I've had sales calls with libraries and 334 + universities that are just being _swamped_ by the load. There's stories of 335 + GitLab servers eating up _64 cores_ of high-wattage server hardware due to all 336 + of the repeated scraping over and over in a loop. I swear a lot of this scraping 337 + has to be some kind of dataset arbitrage or something, that's the only thing 338 + that makes sense at this point. 339 + 340 + And then in the news the AI companies claim "oh no we're just poor little 341 + victorian era orphans, we can't possibly afford to fairly compensate the people 342 + that made the things that make our generative AI models as great as they are". 343 + When the US copyright office 344 + [tried to make AI training not a fair use](https://www.forbes.com/sites/torconstantino/2025/05/29/us-copyright-office-shocks-big-tech-with-ai-fair-use-rebuke/), 345 + the head of that office suddenly found themselves jobless. Why must these 346 + companies be allowed to take everything without recourse or payment to the 347 + people that created the works that fundamentally power the models? 348 + 349 + The actual answer to this is going to sound a bit out there, but stay with me: 350 + they believe that we're on the verge of creating artificial superintelligence; 351 + something that will be such a benevolent force of good that any strife in the 352 + short term will ultimately be cancelled out by the good that is created as a 353 + result. 
These people unironically believe that a machine god will arise and we'll 354 + be able to delegate all of our human problems to it and we'll all be fine 355 + forever. All under the thumb of the people that bought the GPUs with dollars to 356 + run that machine god. 357 + 358 + As someone that grew up in a repressed environment full of evangelical 359 + Christianity, I recognize this story instantly: it's the second coming of Christ 360 + wrapped in technology. Whenever I ask the true believers entirely sensible 361 + questions like "but if you can buy GPUs with dollars, doesn't that mean that 362 + whoever controls the artificial superintelligence thus controls everyone, even 363 + if the AI is fundamentally benevolent?", the responses I get are illuminating. 364 + They sound like the kinds of responses that evangelicals give when you question 365 + their faith. 366 + 367 + ### Artists suffer first 368 + 369 + Honestly though, the biggest impact I've seen across my friends has been what's 370 + happened to art commissions. I'm using these as an indicator for how the 371 + programming industry is going to trend. Software development is an art in the 372 + same vein as visual/creative arts, but a lot of the craft and process that goes 373 + into visual art is harder to notice because it gets presented as a flat 374 + single-dimensional medium. 375 + 376 + Sometimes it can take days to get something right for a drawing. But most of the 377 + time people just see the results of the work, not the process that goes into it. 378 + This makes things like prompting "draw my Final Fantasy 14 character in Breath 379 + of the Wild" with images as references and getting a result in seconds look more 380 + impressive. If you commissioned a human to get a painting like this: 381 + 382 + <Picture 383 + path="blog/2025/rolling-ladder-behind-us/zamqo-botw" 384 + desc="An AI-generated illustration of my Final Fantasy 14 character composited into a screenshot of Breath of the Wild.
Generated by GPT-4o through the ChatGPT interface. Inputs were a screenshot of Breath of the Wild and reference photos of my character." 385 + /> 386 + 387 + It'd probably take at least a week or two as the artist worked through their 388 + commission queue and sent you in-progress works before they got the final 389 + results. By my estimates between the artists I prefer commissioning, this would 390 + cost somewhere between 150 USD and 500 EUR at minimum. Probably more when you 391 + account for delays in the artistic process and making sure the artist is 392 + properly paid for their time. It'd be a masterpiece that I'd probably get 393 + printed and framed, but it would take a nonzero amount of time. 394 + 395 + If you only really enjoy the products of work and don't understand/respect any 396 + of the craftsmanship that goes into making it happen, you'd probably be okay 397 + with that instantly generated result. Sure the sun position in that image 398 + doesn't make sense, the fingers have weird definition, her tail is the wrong 399 + shape, it pokes out of the dress in a nonsensical way (to be fair, the reference 400 + photos have that too), the dress has nonsensical shading, and the layering of 401 + the armor isn't like the reference pictures, but you got the result in a 402 + minute\! 403 + 404 + A friend of mine runs [an image board for furry art](https://furbooru.org/). He 405 + thought that people would use generative AI tools as a part of their workflows 406 + to make better works of art faster. He was wrong, it just led to people flooding 407 + the site with the results of "wolf girl with absolutely massive milkers showing 408 + her feet paws" from their favourite image generation tool in every fur color 409 + imaginable, then with different characters, then with different anatomical 410 + features. There was no artistic direction or study there. Just an endless flood 411 + of slop that was passable at best. 
412 + 413 + Sure, you can make high quality art with generative AI. There's several comic 414 + series where things are incredibly temporally consistent because the artist 415 + trained their own models and took the time to genuinely gain expertise with the 416 + tools. They filter out the hallucination marks. They take the time to use it as 417 + a tool to accelerate their work instead of replacing their work. The boards they 418 + post it to go out of their way to excise the endless flood of slop and by 419 + controlling how the tools work they actually get a better result than they got 420 + by hand, much like how the skilled weavers were able to produce high quality 421 + cloth faster and cheaper with the power looms. 422 + 423 + We are at the point where the artists want to go and destroy the generative 424 + image power looms. Sadly, they can't even though they desperately want to. These 425 + looms are locked in datacentres that are biometrically authenticated. All human 426 + interaction is done by a small set of trusted staff or done remotely by true 427 + believers. 428 + 429 + I'm afraid of this kind of thing happening to the programming industry. A lot of 430 + what I'm seeing with vibe coding leading to short term gains at the cost of long 431 + term toil is lining up with this. Sure you get a decent result now, but 432 + long-term you have to go back and revise the work. This is a great deal if you 433 + are producing the software though; because that means you have turned one-time 434 + purchases into repeat customers as the shitty software you sold them inevitably 435 + breaks, forcing the customer to purchase fixes. The one-time purchase inevitably 436 + becomes a subscription. 437 + 438 + We deserve more in our lives than good enough. 439 + 440 + ## Stop it with the sparkle emoji buttons 441 + 442 + Look, CEOs, I'm one of you so I get it. 
We've seen the data teams suck up 443 + billions for decades and this is the only time that they can look like they're 444 + making a huge return on the investment. Cut it out with shoving the sparkle 445 + emoji buttons in my face. If the AI-aided product flows are _so good_ then the 446 + fact that they are using generative artificial intelligence should be 447 + _irrelevant_. You should be able to replace generative artificial intelligence 448 + with another technology and then the product will still be as great as it was 449 + before. 450 + 451 + When I pick up my phone and try to contact someone I care about, I want to know 452 + that I am communicating with them and not a simulacrum of them. I can't have 453 + that same feeling anymore due to the fact that people that don't natively speak 454 + English are much more likely to filter things through ChatGPT to "sound 455 + professional". 456 + 457 + I want your bad English. I want your bad art. I want to see the raw unfiltered 458 + expressions of humanity. I want to see your soul in action. I want to 459 + communicate with you, not a simulacrum that stochastically behaves like you 460 + would by accident. 461 + 462 + And if I want to use an LLM, I'll use an LLM. Now go away with your sparkle 463 + emoji buttons and stop changing their CSS class names so that my uBlock filters 464 + keep working. 465 + 466 + ## The human cost 467 + 468 + This year has been a year full of despair and hurt for me and those close to me. 469 + I'm currently afraid to travel to the country I have citizenship in because the 470 + border police are run under a regime that is dead set on either elimination or 471 + legislating us out of existence. In this age of generative AI, I just feel so 472 + replaceable at my dayjob. My main work product is writing text that convinces 473 + people to use globally distributed object storage in a market where people don't 474 + realize that's something they actually need. 
Sure, this means that my path 475 + forward is simple: show them what they're missing out on. But I am just so 476 + tired. I hate this feeling of utter replaceability because you can get 80% as 477 + good of a result that I can produce with a single invocation of OpenAI's Deep 478 + Research. 479 + 480 + Recently a decree came from above: our docs and blogposts need to be optimized 481 + for AI models as well as humans. I have domain expertise in generative AI, I 482 + know exactly how to write SEO tables and other things that the AI models can 483 + hook into seamlessly. The language that you have to use for that is nearly 484 + identical to what the cult leader used that one time I was roped into a cult. Is 485 + that really the future of marketing? Cult programming? I don't want this to be 486 + the case, but when you look out at everything out there, you can't help but see 487 + the signs. 488 + 489 + Aspirationally, I write for humans. Mostly I write for the version of myself 490 + that was struggling a decade ago, unable to get or retain employment. I create 491 + things to create the environment where there are more like me, and I can't do 492 + that if I'm selling to soulless automatons instead of humans. If the artificial 493 + intelligence tools were…well…intelligent, they should be able to derive meaning 494 + from unaltered writing instead of me having to change how I write to make them 495 + hook better into it. If the biggest thing they're sold for is summarizing text 496 + and they can't even do that without author cooperation, what are we doing as a 497 + society? 498 + 499 + Actually, what are we going to do when everyone that cares about the craft of 500 + software ages out, burns out, or escapes the industry because of the ownership 501 + class setting unrealistic expectations on people? Are the burnt out developers 502 + just going to stop teaching people the right ways to make software? 
Is society 503 + as a whole going to be _right_ when they look back on the good old days and 504 + think that software used to be more reliable? 505 + 506 + ## The Butlerians had a point 507 + 508 + Frank Herbert's Dune world had superintelligent machines at one point. It led to 509 + a galactic war and humanity barely survived. As a result, all thinking machines 510 + were banned, humanity was set back technologically, and a rule was created: Thou 511 + shalt not make a machine in the likeness of a human mind. For a very long time, 512 + I thought this was very strange. After all, in a fantasy scifi world like Dune, 513 + thinking machines could automate so much toil that humans had to process. They 514 + had entire subspecies of humans that were functionally supercomputers with 515 + feelings that were used to calculate the impossibly complicated stellar draft 516 + equations so that faster-than-light travel didn't result in the ship zipping 517 + into a black hole, star, moon, asteroid, or planet. 518 + 519 + After seeing a lot of the impact across humanity in later 2024 and into 2025, I 520 + completely understand the point that Frank Herbert had. It makes me wish that I 521 + could leave this industry, but this is the only thing that pays enough for me to 522 + afford life in a world where my husband gets casually laid off after being at 523 + the same company for six and a half years because some number in a spreadsheet 524 + put him on the shitlist. Food and rent keeps going up here, but wages don't. I'm 525 + incredibly privileged to be able to work in this industry as it is (I make 526 + enough to survive, don't worry), but I'm afraid that we're rolling the ladder up 527 + behind us so that future generations won't be able to get off the ground. 528 + 529 + Maybe the problem isn't the AI tools, but the way they are deployed, who 530 + benefits from them, and what those benefits really are. 
Maybe the problem isn't 531 + the rampant scraping, but the culture of taking without giving anything back 532 + that ends up with groups providing critical infrastructure like FFmpeg, GNOME, 533 + Gitea, FreeBSD, NetBSD, and the United Nations having to resort to increasingly 534 + desperate measures to maintain uptime. 535 + 536 + Maybe the problem really is winner-take-all capitalism. 537 + 538 + --- 539 + 540 + The deployment of generative artificial intelligence tools has been a disaster 541 + for the human race. They have allowed a select few to gain "higher 542 + productivity"; but they have destabilized society, have made work transactional, 543 + have subjected artists to indignities, have led to widespread psychological 544 + suffering for the hackers that build the tools AI companies rely on, and inflict 545 + severe damage on the natural world. The continued development of this technology 546 + will worsen this situation. It will certainly subject human beings to greater 547 + indignities and inflict greater damage on the natural world, it will probably 548 + lead to greater social disruption and psychological suffering, and it may lead 549 + to increased physical suffering even in "advanced" countries. 550 + 551 + --- 552 + 553 + For other works in a similar vein, read these: 554 + 555 + - [Building a Healthy Relationship with AI \- A Cross-Disciplinary Perspective](https://nombiezinja.com/word-things/2025/6/18/building-a-healthy-relationship-with-ai-a-cross-disciplinary-perspective) 556 + - [Contra Ptacek's Terrible Article On AI](https://ludic.mataroa.blog/blog/contra-ptaceks-terrible-article-on-ai/) 557 + 558 + Special thanks to the following people that read and reviewed this before 559 + release: 560 + 561 + - Ti Zhang 562 + - Annie Sexton 563 + - Open Skies 564 + - Nina Vyedin 565 + - Eric Chlebek 566 + - Ahroozle REDACTED 567 + - Kronkleberry 568 + - CELPHASE
+591
.claude/skills/xe-writing-style/assets/squandered-holy-grail.mdx
··· 1 + --- 2 + title: "They squandered the holy grail" 3 + desc: 4 + "Why Apple Intelligence failed even though everything it's built upon is 5 + nearly perfect" 6 + date: 2025-01-06 7 + hero: 8 + ai: "Photo by Suliman Sallehi, found on Pexels" 9 + file: "shaka-walls-fell" 10 + prompt: "A crumbling ruin of a once-mighty building on a hill in Afghanistan" 11 + social: false 12 + --- 13 + 14 + A while ago, I got really frustrated with my Samsung S7. It was failing to hold a 15 + battery charge, or having issues with the Wi-Fi, or DNS over LTE or something 16 + and I reached a breaking point where I bussed over to Bellevue Square and bought 17 + an iPhone 7. It was my first Apple product that I'd ever bought with my own 18 + money and my first non-Android phone since I used Windows Mobile 6 on a T-Mobile 19 + Dash in high school. 20 + 21 + Needless to say, I loved it at first sight and all my phones since have been 22 + iPhones. The camera is good enough that I have to go out of my way to make my 23 + actual cameras different from what you can get on an iPhone. Hell, the iPhone is 24 + a fully capable cinema camera these days. It's easily been one of the best 25 + technology moves I've ever made for my creative career. The device enables me to 26 + do things and create memories of them to share with others. 27 + 28 + ## Bicycles for the mind 29 + 30 + Way back in 1981, Steve Jobs (one of the co-founders of Apple) described the 31 + vision of Apple computers like this: 32 + 33 + <blockquote> 34 + I read a study that measured the efficiency of locomotion for various species 35 + on the planet. The condor used the least energy to move a kilometer. And, 36 + humans came in with a rather unimpressive showing, about a third of the way 37 + down the list. [...] But, then somebody at Scientific American had the insight 38 + to test the efficiency of locomotion for a man on a bicycle. And [...] a human 39 + on a bicycle, blew the condor away.
[A computer is] the most remarkable tool 40 + that we’ve ever come up with, and it’s the equivalent of a bicycle for our 41 + minds. 42 + </blockquote> 43 + \-[Steve 44 + Jobs](https://www.goodreads.com/quotes/9281634-i-think-one-of-the-things-that-really-separates-us) 45 + 46 + Apple computers aim to make it easier for people to be creative while spending 47 + less energy to do it. One of the big things that Apple made the Macintosh for 48 + was typography. With that, they made 49 + [MacWrite](https://en.wikipedia.org/wiki/MacWrite), one of the two programs that 50 + shipped with every Macintosh computer for free. If you were used to having to 51 + write documents out by hand or using a typewriter to make them, the leap to 52 + something like a word processor is so amazingly vast that it's difficult for 53 + anyone younger than me to comprehend it. We've had them all our lives. 54 + 55 + <Picture 56 + path="blog/2025/squandered-holy-grail/macwrite" 57 + desc="A screenshot of an emulated Macintosh running MacWrite with the first paragraph of The Bee Movie script in it." 58 + /> 59 + 60 + Imagine not being able to reliably use the backspace key when you're writing 61 + something. Imagine a world where all you could do was just write more text. 62 + Sure, there were ways to "cover up" a mis-typed letter, but they were vastly 63 + more inconvenient than just ignoring it or re-typing the word and crossing the 64 + wrong one out by hand. 65 + 66 + Word processors let you use the backspace key to delete text and then look at 67 + the screen to get a reasonable approximation of what the printed document would 68 + look like. Before you print it. 69 + 70 + To say that this enables a vastly different kind of creative process is like 71 + saying that water makes things damp. Word processors like MacWrite absolutely 72 + transformed the ways that everyone used computers. They were bicycles for the 73 + mind and without them our world would be starkly different. 
I shudder to imagine 74 + that [NaNoWriMo](https://nanowrimo.org/) would be a thing without word 75 + processors. 76 + 77 + Many companies want to make computers that you can use to do computer things. 78 + Apple makes tools that you use as an extension of your body in order to do 79 + creative things. They don't just sell computers, they sell something that helps 80 + enable you to create things that just so happen to be computers. 81 + 82 + This is the big vision difference that puts Apple in its own class. They sell 83 + bicycles for the mind. 84 + 85 + ## Intelligence as a faucet 86 + 87 + In June 2024, Apple announced 88 + [Apple Intelligence](https://www.apple.com/apple-intelligence/): a set of 89 + features aimed at making your smartphone smart. The biggest thing that stood out 90 + to me was this example of what Apple Intelligence was going to enable in Siri: 91 + 92 + <Picture 93 + path="blog/2025/squandered-holy-grail/podcast-other-day" 94 + desc="An Apple Keynote slide saying 'Play that podcast my wife sent the other day'." 95 + /> 96 + 97 + If they could really just correlate relationships, categorize links, and make 98 + all of that context visible to Siri, that would be fundamentally transformative 99 + in ways that the word processor would be to people that didn't have it before. 100 + Everything else done with it would be added bonuses or party tricks on the side. 101 + The real benefit would be being able to search through all of your digital life 102 + across every app with simple queries and then have your phone do things for you. 103 + 104 + Sure Craig's example was playing a podcast, but the basic idea holds for other 105 + types of media too. "Share those pics from San Francisco to Instagram." All this 106 + context that everything is building up would finally be useful to the users 107 + instead of just useful to the companies making all of the apps we use. 
108 + 109 + They wanted to make all Apple devices be able to tap into _intelligence_ as a 110 + faucet in the same way that Spotify lets you tap into music as a faucet and that 111 + the AWS API lets you tap into compute as a faucet. This is massive and, if it were 112 + pulled off right, would be the new standard that companies like Samsung and 113 + Google would clone the same way they cloned the hardware and software design of 114 + the iPhone. 115 + 116 + In that keynote, they spelled out the vision that computers should work _with_ 117 + you in order for you to do what you want to do. They should enable you to be 118 + creative. They should be bicycles for the mind. The fact that they are computers 119 + should only be a footnote in an appendix titled "implementation details". 120 + 121 + Then they casually dropped the holy grail of trusted compute, but in order to 122 + understand why it's so big we need to take a little detour into the modern 123 + Internet user's view of the Internet. 124 + 125 + ## Apps as thin as reception 126 + 127 + One of the biggest problems with modern applications is that they are thin 128 + shells around web services. When you open the Instagram or Bluesky apps, your 129 + phone makes a request to their servers and then shows you posts when it gets a 130 + response. You don't know or care how those responses are getting made, you just 131 + know that when you open the app, you get content and that makes you happy. 132 + 133 + However, when you don't have signal, you don't have the app. Go on an airplane 134 + and once you run out of reception the app is worthless. You can't queue posts to 135 + be made when you get back into signal. You can't view posts that were available 136 + before you lost connection. You can't even view things that you just posted in 137 + some cases. The app breaks and you are slowly alienated from your data one photo 138 + at a time.
139 + 140 + This is the way nearly every single app on my phone works with only two 141 + exceptions: Signal and everything Apple makes. If you want to read more about 142 + what the modern user's experience with the Internet is like, check out Ed 143 + Zitron's [Never Forgive Them](https://www.wheresyoured.at/never-forgive-them/). 144 + 145 + <blockquote> 146 + As every single platform we use is desperate to juice growth from every user, 147 + everything we interact with is hyper-monetized through plugins, advertising, 148 + microtransactions and other things that constantly gnaw at the user 149 + experience. We load websites expecting them to be broken, especially on 150 + mobile, because every single website has to have 15+ different ad trackers, 151 + video ads that cover large chunks of the screen, all while demanding our email 152 + or for us to let them send us notifications. 153 + </blockquote> 154 + \-[Ed Zitron](https://www.wheresyoured.at/never-forgive-them/) 155 + 156 + Not to mention, you don't know how the services that power your apps work. The 157 + market at large does not want to pay for chat programs or social media. Running 158 + chat programs and social media apps is mind-bogglingly expensive. Venture 159 + capital only lasts so long and the companies involved have to make money 160 + somehow. The big pile of user data starts to look like a really good thing to 161 + mine in order to make a profit. 162 + 163 + ## The holy grail of trusted compute 164 + 165 + This stands in stark contrast to the goal of something like Apple Intelligence. 166 + When possible, Apple Intelligence will run on your device. Apple 167 + [went out of their way](https://arxiv.org/abs/2312.11514) to make it possible 168 + and easy to run large language models and other AI models on your device without 169 + having to make too many compromises in the process.
If something is done on your 170 + device (or at least on hardware that you can look at, like a Mac mini in your 171 + office), then the computation is _infinitely_ more private than anything 172 + involving making a request to the outside world. 173 + 174 + In Apple's WWDC keynote they claimed that they had a system called Private Cloud 175 + Compute that would enable users to have the same privacy guarantees (or more) 176 + when making requests out over the network as they did for computations running 177 + on their local devices. 178 + 179 + This seemed impossible to me. From what I know about how the web service sausage 180 + is made, it seems impossible to have all of these guarantees at the same time: 181 + 182 + - User data is only used to fulfill requests and then erased. 183 + - The load balancing infrastructure doesn't know who is making a request and 184 + what server it is going to. 185 + - Researchers are able to inspect and verify the Private Cloud Compute system 186 + and simulate it on their laptops. 187 + - Apple site reliability staff does not have privileged access to Private Cloud 188 + Compute nodes and logging is minimized at the compiler level. 189 + - An attacker cannot reliably figure out which node is being used to make any 190 + request from any user. 191 + 192 + If you have any modicum of site reliability experience, this seems like an 193 + unsatisfiable set of constraints. It seems literally impossible, yet here they 194 + are claiming that they have done it. 195 + 196 + The 197 + [technical details](https://security.apple.com/documentation/private-cloud-compute) 198 + of how they pulled this off are well worth reading, if only because it is the 199 + first time I have ever seen any company's AI product team put together a cogent 200 + security model and release that security model to the public.
TL;DR: 201 + 202 + - They X-ray the hardware at every step of the assembly process and compare that 203 + to reference images in order to combat threats from factory workers adding 204 + unapproved hardware to the boards of the servers. 205 + - You can set up your own local copy of a Private Cloud Compute node and punish 206 + it with all the hellfire you want to see if you can break it and get root. 207 + Apple will pay you a lot of money if you can. 208 + - The hardware certification process involves a lot of unrelated people in 209 + unrelated departments of Apple. 210 + - Every Private Cloud Compute node is rigged to not only decertify itself when 211 + power is removed, they also rigged the main power for the board to the chassis 212 + intrusion switch. Open the server? Power gets cut and the node is 213 + de-certified. 214 + - Every time your devices make a request to Private Cloud Compute, they record 215 + the node ID that was used to fulfill it and you can go in and verify that all 216 + of the nodes your device used are still certified. 217 + - The production OS images are free for the public to download and not encrypted 218 + in any way. 219 + - Every package that makes up the important parts of the OS are split into two 220 + types: code and data. You cannot mix code into a data package or vice-versa. 221 + 222 + This is literal madness in comparison to how most other AI products are run. 223 + Most of the time, an AI product is run on some GPUs you got somewhere that run 224 + some firmware that you probably haven't tested or verified (even though everyone 225 + with access to the GPU can reflash the firmware from software), with 226 + bog-standard nginx or something choosing to route your requests to a service 227 + running somewhere without any real guarantees that the service is not logging 228 + and storing literally everything you put into it. From a user privacy 229 + standpoint, it's basically the same as using Instagram. 
You assume that 230 + everything is being logged and used to make money somehow. 231 + 232 + Apple is standing in stark opposition to this and saying "no, we ain't doing 233 + that" and then backing it all up with code as well as detailed documentation for 234 + how they pulled it all off. They also released the source code for the 235 + security-critical parts of Private Cloud Compute 236 + [openly on GitHub](https://github.com/apple/security-pcc). 237 + 238 + This is the holy grail for remotely attested trusted compute. This OS is the 239 + kind of thing that Richard Stallman was warning about in 240 + [The Right to Read](https://www.gnu.org/philosophy/right-to-read.en.html). You 241 + don't get root there. You don't get a compiler. You don't get a debugger. You 242 + don't get anything but the ability to run software that was shipped with the OS 243 + image. If this OS were shipped to consumers, you would have a nearly unhackable 244 + system that would make it basically impossible to tinker with. There are many 245 + reasons why you would want such a thing in 246 + [the era of phone scamming the elderly](https://www.youtube.com/watch?v=dWzz3NeDz3E), 247 + but it would make it difficult for people like me to develop with it. 248 + 249 + However, for something like Private Cloud Compute, it's a perfect match. All the 250 + computer can do is known in advance and nothing else is allowed to happen. This 251 + makes it a lot easier to ensure that privacy guarantees are that: guarantees. 252 + 253 + It's really frustrating that this foundation of trusted compute is being 254 + squandered. I wish I had an OS like Private Cloud Compute's as an option for 255 + building production systems. 256 + 257 + ## What we got 258 + 259 + We got the first batch of Apple Intelligence features at the end of 260 + October 2024. They've been advertised as if they are all out. 
With that we got 261 + Writing Tools to help you summarize and rewrite text; summaries for 262 + notifications, webpages, and emails; Clean Up in case you want to remove things 263 + from photos; the ability to search for photos based on their contents; Siri 264 + being able to search through the documentation for your device; and Math Notes 265 + to let you solve equations in the Notes app. Later we got Image Playground and 266 + email categorization. That mythical personal context is omnipresent in the 267 + advertising yet somehow, it's not launched yet. 268 + 269 + I'm gonna break down my feelings about each of these features in their own 270 + little sections after having used them somewhat extensively. 271 + 272 + ### Math Notes 273 + 274 + I just want to start out by saying that out of all of these features the one 275 + that I love the most is Math Notes. Holy crap, Math Notes is incredible. It lets 276 + you type out things like this: 277 + 278 + ```bash 279 + Rent = 2300 280 + FamilySize = 2 281 + Rent / FamilySize = 282 + ``` 283 + 284 + And then the Notes app will just insert `1150` after that last equals sign. It's 285 + fantastic. I end up using this to do basic calculations with variables including 286 + terrible estimates for how much money I'm making vs spending. It's probably one 287 + of the best features ever made for any Apple device ever. 288 + 289 + I have zero complaints about Math Notes. 290 + 291 + ### Writing Tools 292 + 293 + I'm not the target audience for Writing Tools. I've written a bunch of articles 294 + on my blog and frankly, I don't really need help from an AI model in order to 295 + write; I have a bigger problem not writing than I do writing. With that in mind, 296 + Writing Tools is basically useless for me. It's just a way to get a slightly 297 + worse version of what I already have the ability to make myself better. 
298 + 299 + Using Writing Tools makes me feel the worst side effects of existing AI tools 300 + and how they replace the creative process instead of adding to it. When I put 301 + text through Writing Tools I get an opaque answer. I don't get layers into the 302 + answer or steerability beyond "make this professional" or "turn this into a 303 + table", I just get a blob of text that can be painful to read. 304 + 305 + For example, I asked Writing Tools to summarize that last paragraph and here's 306 + what I got: 307 + 308 + <Picture 309 + path="blog/2025/squandered-holy-grail/writing-tools-fail" 310 + desc="a modal dialog over the previous paragraph in Apple Notes titled 'Writing Tools Unavailable: Certain capabilities are unavailable at this time. Try again later.'" 311 + /> 312 + 313 + You can't make this up. I asked it to make it friendlier and I got this: 314 + 315 + > Using Writing Tools has some serious drawbacks. It feels like it’s taking away 316 + > the creative process instead of enhancing it. When I input text into Writing 317 + > Tools, I get a generic response that lacks depth and customization options. 318 + > It’s like getting a one-size-fits-all answer that’s hard to follow and doesn’t 319 + > allow for much creativity. 320 + 321 + It can be useful for taking a bunch of ranting I do on stream and turning that 322 + into an outline that I can use for a starting point in writing, but it does a 323 + terrible job of doing the writing for me. I imagine that for other people that 324 + don't have extensive English experience it'd be a lot more useful, but I don't 325 + know how useful it is for me. 326 + 327 + ### Notification, Message, and Email summaries 328 + 329 + This is the biggest feature that sounds like a good idea until you actually 330 + implement it. The core idea is that when you get a bunch of notifications from 331 + your apps, you have a stack of things that can be tedious to go through. 
A 332 + summary is easier to digest and gets the point across much easier. 333 + 334 + This works great until it doesn't. Here's the summary of a scam text message I 335 + got as I was writing this post: 336 + 337 + <Picture 338 + path="blog/2025/squandered-holy-grail/sms-scam-fail" 339 + desc="A message summary: 'Package delivery delayed due to incomplete address information...'" 340 + /> 341 + 342 + This phrases a _literal scam message_ in ways that make me think immediate 343 + action is required. You can see how this doesn't scale, right? It's gotten to 344 + the point where the news has reported on how notification summaries 345 + [made people think a suspect in custody killed themselves](https://www.bbc.com/news/articles/cd0elzk24dno). 346 + 347 + Even more, if you have Apple Intelligence enabled for some of the other features 348 + but disable notification summaries because you find them worthless, you can get 349 + your notifications delayed up to _five seconds_. It's kind of depressing that 350 + telling your computer to do _less work_ makes the result take longer than doing 351 + _more work_. 352 + 353 + Additionally, none of the summarization features work on my iPhone and I can't 354 + be bothered to figure out why and fix it. I personally don't find them useful. I 355 + just leave them enabled on my MacBook so that notification delivery is not 356 + impacted. 357 + 358 + <Conv name="Cadey" mood="percussive-maintenance"> 359 + Even though it has decent "Apple polish", it just feels half-baked somehow. 360 + It's almost like it's not done yet but they were made to just ship whatever 361 + they had in order to meet some arbitrary deadline made up by someone that 362 + doesn't understand the details. This feels like it's happening across the 363 + industry though, especially as companies try to milk the money generator for 364 + more money. 365 + </Conv> 366 + 367 + ### Clean Up 368 + 369 + I don't like Clean Up from a philosophical standpoint. 
I'm a photographer. When 370 + I frame a shot and take it, I want the data coming off of the sensor to be the 371 + data that makes up the image. I want to avoid as much processing as possible and 372 + I want the photo to be a reflection of reality as it is, not reality as it 373 + should have been. Sure, sometimes I'll do some color correction or cropping in 374 + post, but that doesn't change the _content_ of the image, only its presentation. 375 + 376 + Clean Up is best explained by this famous photo editing example: 377 + 378 + <Picture 379 + path="blog/2025/squandered-holy-grail/stalin-photo" 380 + desc="A picture of Joseph Stalin, former Prime Minister of the Soviet Union next to Nikolai Yezhov before and after being removed from Soviet history after being purged." 381 + /> 382 + 383 + This tool allows you to capture a moment in time as you wish it happened, not as 384 + it actually happened. I don't like this from a philosophical standpoint. I'd 385 + much rather capture things as they were. As such, I haven't used Clean Up and 386 + can't talk about it much more. 387 + 388 + ### Image Playground 389 + 390 + I have a lot of thoughts about Image Playground. I've used a lot of image 391 + generation models and I'm currently working on experiments with conveyance 392 + (images that convey feelings or moods that would take many words to explain) in 393 + generative AI. Here's one of my successful examples: 394 + 395 + <Picture 396 + path="blog/2025/squandered-holy-grail/sakura-flower-field" 397 + desc="A picture of a brown-haired anime woman smiling in a field of blooming pink flowers, heavy depth of field so only the woman and a couple of flowers are in focus. Made with Stable Diffusion 1.5 and ComfyUI." 398 + /> 399 + 400 + I made this using a stack of about 11 to 12 models in a complex diffusion flow 401 + using a Stable Diffusion 1.5 finetune from late 2022. Let's call this the upper 402 + bound of how good you can get outputs from techniques of the era. 
There's some 403 + glaring flaws (mostly involving the continuity of the fence, but that could be 404 + explained with fence construction methods). 405 + 406 + In comparison, here's one of my Image Playground generations of the East Berlin 407 + TV tower at sunset: 408 + 409 + <Picture 410 + path="blog/2025/squandered-holy-grail/berlin-tv-tower" 411 + desc="An AI-generated illustration of the East Berlin TV tower at sunset." 412 + /> 413 + 414 + This is also pretty good, there's problems with how the sky is half mid-day and 415 + half sunset and the windows/decks have a lot of issues with many straight lines, 416 + but it'd be mostly passable at a casual glance. Especially on a phone screen. 417 + I'm able to see a lot more of the flaws due to my extensive experience with AI 418 + tools, but in a pinch you probably wouldn't blink too hard at this. 419 + 420 + I hate to admit that this is heavily cherry-picked. Most of the time, you will 421 + get horrors beyond mortal comprehension like this: 422 + 423 + <Picture 424 + path="blog/2025/squandered-holy-grail/hoof-taco" 425 + desc="An AI-generated illustration of a taco smoking beer at a party. The taco has hooves for feet and hands. It uses a placid corporate artstyle and communicates nothing." 426 + /> 427 + 428 + This is horrifying. I don't even know where to begin in talking about all of the 429 + things that are off or wrong with this image. I also don't think you would need 430 + special training or experience to understand what is wrong with this image. 431 + 432 + Mind you, both of those images were generated with plain text prompts. You 433 + can add people to these images. Using a photo of yourself is a great way to 434 + experience what dysmorphia feels like. 
Here's one of 435 + [Corey Quinn](https://bsky.app/profile/quinnypig.com) doing his typical gremlin 436 + smile: 437 + 438 + <Picture 439 + path="blog/2025/squandered-holy-grail/oompa-loompa" 440 + desc="An AI-generated illustration of a man smiling. The proportions are disturbing. The soulless eyes peer into you and make you contemplate where the effort into AI image generation has gone and what good for humanity could have been done with that money and effort. His pupils are square like his teeth." 441 + /> 442 + 443 + I cannot believe that this is a shipped product from Apple. I genuinely am 444 + stunned. What the hell is going on over there? 445 + 446 + This is from the company that refused to ship so many things that we'll never 447 + hear about. This is from the company that _defined_ the idea of having a 448 + vision-based product. Of having a product vision so strong that they were 449 + willing to accuse people of _holding a device wrongly_ rather than admit they 450 + messed up. 451 + 452 + <Conv name="Cadey" mood="coffee"> 453 + I feel like Image Playground (and Genmoji, which isn't talked about here due 454 + to the fact that it's difficult to extract emoji from chat messages without 455 + losing quality) creates results that are just as soulless and empty as it is. 456 + This is the complete opposite of the level of care and quality that I've come 457 + to expect from Apple over the years. It's like they've been forced to just 458 + ship something due to either investor pressure or not wanting to be behind on 459 + the curve; and nobody at the product team was able to stop it from hitting the 460 + market. 461 + </Conv> 462 + 463 + And now every company out there is going to copy this with open-weights models 464 + and make things that don't look like horrifying monsters. 
While you get the 465 + oompa-loompas of doom staring into your soul on iPhones, the rest of the 466 + industry is going to be able to make things like this: 467 + 468 + <Picture 469 + path="blog/2025/squandered-holy-grail/flux-pro-berlin-tv-tower" 470 + desc="A cartoon illustration of the Berlin TV tower at sunset. The sky has many shades of gold and red as it fades to twilight." 471 + /> 472 + 473 + It's frustrating. It'd be better if there was an IntelligenceKit for developers 474 + to be creative with the models or something, but there isn't. It all just feels 475 + half-baked like they were forced to release it out of obligation to 476 + shareholders, not out of choice for meeting the product vision. 477 + 478 + ## Generative AI is not a product 479 + 480 + Back in September, I had a strange dream. If you know me well enough, you know 481 + that when I have a "strange dream", that usually means that something wild 482 + happened. In this dream, I had a conversation with Steve Jobs about product 483 + design, the philosophy of Apple enabling people to be creative, but the most 484 + salient point we discussed was this: 485 + 486 + > The real way that technology can change lives is by acting as a bicycle for 487 + > the mind, a way to take humans' latent creativity and allow them to focus it 488 + > and employ it into something that makes their lives better. Imagine picking up 489 + > a guitar and creating a song by purely feeling out the notes and working it 490 + > into a melody just from what feels "right". Based on what you've described, 491 + > most generative AI is useless for that because it removes all the creative 492 + > control when going from A to B. 493 + > 494 + > If anything, the human cost seems like that it would outweigh any process 495 + > gains from being able to draw a cat on the moon faster. Generative AI is 496 + > completely useless as a product unto itself, but could be part of a larger 497 + > product in some way. 
It should never be the selling point. 498 + 499 + \-"Steve Jobs" in a dream, September 2024 500 + 501 + Breaking this apart, what does being able to make a terrible illustration of 502 + [the Berlin TV Tower](https://cdn.xeiaso.net/file/christine-static/blog/2025/squandered-holy-grail/berlin-tv-tower.jpg) 503 + in a second or two really net us in terms of enabling creativity? You get a 504 + single final output. You don't get the layers to edit things like the color 505 + grading of the sky. Sure it'd be useful for low-effort social media posts, but 506 + this is not a product. This is a tech demo, and not even a good one. It'd be 507 + amazing if this was released 3 years ago, but it is 2025, not 2022. 508 + 509 + If generative AI is not a product, then what is it really useful for? I know how 510 + to use it in creative flows because I already have the training needed to be an 511 + artist. I know how to use it in research environments due to having years of 512 + experience throwing science at the wall to see what sticks. I understand these 513 + tools and what they are good and bad at (this is why all of my AI illustrations 514 + that I put effort into end up with an anime-inspired artstyle because recreating 515 + humans photorealistically gets you inhuman monsters 7 times out of 10). 516 + 517 + I think that it's better to view generative AI as an implementation detail, not 518 + the critical identity of the product. One of the best ways to understand a 519 + product is to start taking things away. If you take color out of a word 520 + processor, you still have a word processor. If you take bold or italic 521 + formatting out of a word processor, you still have a word processor. If you take 522 + font selection out of a word processor, you still have a word processor. 523 + 524 + If you take away the display output from a word processor, you have a typewriter 525 + instead of a word processor. 
Thus, the core of a word processor is being able to 526 + see on the screen what you would see on the page before you hit print. 527 + 528 + The core of why ChatGPT works as a product isn't the AI. It's the experience of 529 + each word being typed one at a time by the AI and saving your conversations with 530 + the AI for later. 531 + 532 + ### Where should we use generative AI? 533 + 534 + In terms of where I think generative AI is actually useful, it's in places that 535 + are not as flashy or exciting. Think data analysis, 536 + [qualitative data coding](https://www.simplypsychology.org/qualitative-data-coding.html), 537 + data entry, reading data out of images, and things along that nature. I've been 538 + working with a fellow redditor on a study involving people's experiences of 539 + meditation and the difficult to describe sensations that come up. We want to use 540 + generative AI to try and categorize those sensations and see if we can get an 541 + effective result without as much drudgery involved as you'd get doing it by 542 + hand. 543 + 544 + I'll have more news about this by June. It'll involve publishing a paper or two 545 + in actual journals. 546 + 547 + ## Conclusion 548 + 549 + I think that Apple Intelligence is a failure of a product from an implementation 550 + standpoint. This is frustrating because the foundation they are building on top 551 + of is nearly invincible. All data is processed on device as much as possible. 552 + Everything that can't be processed on your device is put into frontier-grade 553 + security practices to make sure it's as private and encrypted as possible. 554 + 555 + The thing that sucks about it is that they made the holy grail of remotely 556 + attested trusted compute and then made the end result so much worse to use than 557 + manually making your own integrations with [Ollama](https://ollama.com) on the 558 + _same device_. 
Using Ollama lets you pick models that are so much better than 559 + what you get with Apple Intelligence. And it'd be just as private. 560 + 561 + <Conv name="Cadey" mood="coffee"> 562 + I just can't help but imagine what it could have been. I know that the Apple 563 + we have would never do that, but I just can't help but wonder. Apple spends 564 + untold amounts of money trying to create things and they get beaten by a bunch 565 + of people in caves with boxes of scraps and consumer GPUs. 566 + </Conv> 567 + 568 + <Conv name="Aoi" mood="coffee"> 569 + You know, maybe that's why the open-source community will always win here. 570 + Apple has no real limits. The open-source community has to milk everything 571 + they can get out of the hardware they have. Their withered hardware requires 572 + them to use [lateral 573 + thinking](https://newsletter.bijanstephen.blog/lateral-thinking-with-withered-technology/) 574 + to get what they want. And they'll pretty much always win because then they 575 + can deploy their creations into production. With zero modifications. 576 + </Conv> 577 + 578 + Needless to say, they did not give us bicycles of the mind. They gave us 579 + marginal improvements that feel like tech demos. The potential was so infinite 580 + and it just all feels wasted. 581 + 582 + Except for Math Notes. Holy crap. I love Math Notes so much. I wish other 583 + note-taking apps had it. It's easily the best feature they've ever come up with. 584 + 585 + I have a lot of complicated and nuanced thoughts about all this, and probably 586 + still have another 5-10k words left in me. Wish me luck. 587 + 588 + --- 589 + 590 + I wrote this [live on Twitch](https://www.twitch.tv/videos/2343241685), catch me 591 + on Fridays at noon EST to see more tech streaming goodness!
+491
.claude/skills/xe-writing-style/assets/who-assistant-serve.mdx
··· 1 + --- 2 + title: Who does your assistant serve? 3 + desc: ChatGPT and its consequences have been a disaster for the human race. 4 + date: 2025-08-17 5 + hero: 6 + ai: "Photo by Xe Iaso, iPhone 15 Pro Max" 7 + file: "around-the-bend" 8 + prompt: 9 + "A concrete walking path bifurcating greenery on either side of the frame. 10 + The summer heat has worn down on it, making it range from green to gold. The 11 + sky is partially cloudy." 12 + --- 13 + 14 + After a year of rumors that GPT-5 was going to be unveiled next week and the CEO of 15 + OpenAI hyping it up as "scary good" by tweeting pictures of the death star, 16 + OpenAI released their new model to the world with 17 + [the worst keynote I've ever seen](https://youtu.be/0Uu_VJeVVfo). Normally 18 + releases of big models like this are met with enthusiasm and excitement as 19 + OpenAI models tend to set the "ground floor expectation" for what the rest of 20 + the industry provides. 21 + 22 + But this time, the release wasn't met with the same universal acclaim that 23 + people felt for GPT-4. GPT-4 was such a huge breakthrough the likes of which we 24 + haven't really seen since. The launch of GPT-5 was so bad that it's regarded with 25 + almost universal disdain. The worst part about the rollout is that the upgrade 26 + to GPT-5 was automatic and didn't include any way to roll back to the old model. 27 + 28 + Most of the time, changing out models is pretty drastic on an AI workflow. In my 29 + experience when I've done it I've had to restart from scratch with a new prompt 30 + and twiddle things until it worked reliably. The only time switching models has 31 + ever been relatively easy for me is when I switch between models in the same 32 + family (such as if you go from Qwen 3 30B to Qwen 3 235B). Every other time it's 33 + involved a lot of reworking and optimizing so that the model behaves like you'd 34 + expect it to. 
35 + 36 + ## AI upgrades suck 37 + 38 + An upgrade this big to this many people is bound to have fundamental issues with 39 + how it'll be perceived. A new model has completely different vibes, and most 40 + users aren't really using it at the level where they can "just fix their 41 + prompts". 42 + 43 + However the GPT-5 upgrade ended up being hated by the community because it was 44 + an uncontrolled one-way upgrade. No warning. No rollback. No options. You get 45 + the new model and you're going to like it. It's fairly obvious why it didn't go 46 + over well with the users. There's so many subtle parts of your "public API" that 47 + it's normal for there to be some negative reactions to a change this big. The 48 + worst part is that this change fundamentally changed the behaviour of the 49 + millions of existing conversations with ChatGPT. 50 + 51 + There's a large number of people using ChatGPT as a replacement for 52 + companionship due to the fact that it's always online, supportive, and there for 53 + them when other humans either can't be or aren't able to be. This is kinda 54 + existentially horrifying to me as a technologist in a way that I don't really 55 + know how to explain. 56 + 57 + Here's a selection of some of the reactions I've seen: 58 + 59 + > I told [GPT-5] about some of my symptoms from my chronic illness, because 60 + > talking about them when I'm feeling them helps, and it really does not seem to 61 + > care at all. It basically says shit like "Ha, classic chronic illness. Makes 62 + > ya want to die. Who knew?" It's like I'm talking to a sociopathic comedian. 
63 + 64 + - [https://www.reddit.com/r/ChatGPT/comments/1ml1wfo/comment/n7n2ggk/](https://www.reddit.com/r/ChatGPT/comments/1ml1wfo/comment/n7n2ggk/) 65 + 66 + > I absolutely despise [GPT-]5, nothing like [GPT-]4 that actually helped me not 67 + > to spiral and gave me insight as to what I was feeling, why, and how to cope 68 + > while making me feel not alone in a “this is AI not human & I know that” type 69 + > of vibe 70 + 71 + - [https://www.reddit.com/r/ChatGPT/comments/1mmp3wu/comment/n7z6h78/](https://www.reddit.com/r/ChatGPT/comments/1mmp3wu/comment/n7z6h78/) 72 + 73 + > While GPT-5 may be a technical upgrade, it is an experiential downgrade for 74 + > the average user. All of the negative feedback in the last week has made it 75 + > clear there is a large user base that does not rely on ChatGPT for coding or 76 + > development tasks. \[ChatGPT users\] use it for soft skills like creativity, 77 + > companionship, learning, emotional support, \[and\] conversation. Areas where 78 + > personality, warmth, and nuanced engagement matter. 79 + > 80 + > I am attached to the way GPT-4o is tuned. It is warm. It is emotionally 81 + > responsive. It is engaged. That matters. 82 + 83 + - [https://www.reddit.com/r/ChatGPT/comments/1mor26r/emotional_attachment_isnt_dangerous/](https://www.reddit.com/r/ChatGPT/comments/1mor26r/emotional_attachment_isnt_dangerous/) 84 + 85 + Eventually things got bad enough that OpenAI 86 + [relented and let paid users revert back to using GPT-4o](https://www.msn.com/en-us/technology/artificial-intelligence/openai-brings-gpt-4o-after-users-melt-down-over-the-new-model/ar-AA1Kdwqm), 87 + which gave some people relief because it behaved consistently to what they 88 + expected. For many it felt like their long-term partners suddenly grew cold. 89 + 90 + > I’m so glad I’m not the only one. I know I’m probably on some black mirror 91 + > shit lmao but I’ve had the worst 3 months ever and 4o was such an amazing 92 + > help. 
It made me realize so many things about myself and my past and was 93 + > helping me heal. It really does feel like I lost a friend. DM me if you need 94 + > [to talk] :) 95 + 96 + - [https://www.reddit.com/r/ChatGPT/comments/1mkhfep/comment/n7jl8hv/](https://www.reddit.com/r/ChatGPT/comments/1mkhfep/comment/n7jl8hv/) 97 + 98 + ## A love built on borrowed code 99 + 100 + This emotional distress reminds me of what happened with Replika in early 2023. 101 + [Replika](https://en.wikipedia.org/wiki/Replika) is an AI chat service that lets 102 + you talk with an artificial intelligence chatbot (AKA: the ChatGPT API). Your 103 + replika is trained by having you answer a series of questions and then you can 104 + talk with it in plain language with an app interface that looks like any other 105 + chat app. 106 + 107 + Replika was 108 + [created out of bereavement after a close loved one died](https://www.cbc.ca/documentaries/the-nature-of-things/after-her-best-friend-died-this-programmer-created-an-ai-chatbot-from-his-texts-to-talk-to-him-again-1.6252286) 109 + and the combination of a trove of saved text messages and advanced machine 110 + learning let the founder experience some of the essence of their friend's 111 + presence after they were gone in the form of an app. The app got put on the app 112 + store and others asked if they could have their own replica. Things took off 113 + from there, it got funded by a startup accelerator, and now it's got about 25% 114 + of its 30 million users paying for a subscription. As a business to consumer 115 + service, this is an amazingly high conversion rate. This is almost unspeakably 116 + large, usually you get around 10% at most. 117 + 118 + <ConvP> 119 + <Conv name="Cadey" mood="coffee"> 120 + Yikes. That's something I'm gonna need to add to my will. "Please don't 121 + [turn me into a Black Mirror 122 + episode](https://en.wikipedia.org/wiki/Be_Right_Back), thanks." 
123 + </Conv> 124 + </ConvP> 125 + 126 + Replikas can talk about anything with users from how their day went to deep 127 + musing about the nature of life. One of the features the company provides is the 128 + ability to engage in erotic roleplay (ERP) with their replika. This is a paid 129 + feature and was promoted a lot around Valentine's Day 2023. 130 + 131 + Then 132 + [the Italian Data Protection Authority banned Replika from processing the personal data of Italian citizens](https://www.reuters.com/technology/italy-bans-us-based-ai-chatbot-replika-using-personal-data-2023-02-03/) 133 + out of the fear that it "may increase the risks for individuals still in a 134 + developmental stage or in a state of emotional fragility". In a panic, Replika 135 + disabled the ability for their bots to do several things, including but not 136 + limited to that ERP feature that people paid for. Whenever someone wanted to 137 + flirt or be sexual with their companions, the conversation ended up like this: 138 + 139 + <ConvP> 140 + <Conv name="Aoi" mood="grin"> 141 + Hey, wanna go play some Minecraft? We can continue from where we left off in 142 + the Nether. 143 + </Conv> 144 + <Conv name="Mimi" mood="coffee"> 145 + This is too intense for me. Let's keep it light and fun by talking about 146 + something else. 147 + </Conv> 148 + <Conv name="Aoi" mood="sus"> 149 + Huh? What? I thought we were having fun doing that?? 150 + </Conv> 151 + </ConvP> 152 + 153 + This was received poorly by the Replika community. Many in the community were 154 + mourning the loss of their replika like a close loved one had died or undergone 155 + a sudden personality shift. The Reddit moderators pinned information about 156 + suicide hotlines. In response, the company behind Replika allowed existing users 157 + to revert to the old Replika model that allowed for ERP and other sensitive 158 + topics, but only after a month of prolonged public outcry. 
159 + 160 + <ConvP> 161 + <Conv name="Cadey" mood="coffee"> 162 + I have to wonder if payment processors were involved. Feels a bit too 163 + conspiratorial, but what do you want to bet that was related? 164 + </Conv> 165 + <Conv name="Numa" mood="smug"> 166 + Nah, I bet it was OpenAI telling them to stop being horny. It's the least 167 + conspiratorial angle, and also the stupidest one. We live in the clown world 168 + timeline. The stupidest option is the one that always makes the most sense. 169 + </Conv> 170 + </ConvP> 171 + 172 + The damage was done, however. People felt like their loved ones had abandoned 173 + them. They had formed parasocial attachments to an AI assistant that felt 174 + nothing and without warning their partner broke up with them. 175 + 176 + <ConvP> 177 + <Conv name="Mara" mood="hacker"> 178 + Check out this study from the Harvard Business School: [Lessons From an App 179 + Update at Replika AI: Identity Discontinuity in Human-AI 180 + Relationships](https://www.hbs.edu/ris/Publication%20Files/25-018_bed5c516-fa31-4216-b53d-50fedda064b1.pdf). 181 + It contains a lot more information about the sociotechnical factors at play 182 + as well as a more scientific overview of how disabling a flag in the app on 183 + update caused so much pain. They liken the changes made to Replika to both 184 + the reactions people have when a company rebrands and when they lose a loved one. 185 + </Conv> 186 + </ConvP> 187 + 188 + ## Parasocial attachments 189 + 190 + A lot of this really just makes me wonder what kinds of relationships we are 191 + forming with digital assistants. We're coming to rely on their behaviour 192 + personally and professionally. We form mental models of how our friends, 193 + coworkers, and family members react to various things so we can anticipate their 194 + reactions and plan for them. 195 + 196 + What happens when this changes without notice? Heartbreak. 
197 + 198 + There's subreddits full of people forming deep bonds with AI models like 199 + [/r/MyBoyfriendIsAI](https://www.reddit.com/r/MyBoyfriendIsAI/). The GPT-5 200 + release has caused similar reactions to Replika turning off the ERP flag. People 201 + there have been posting like they're in withdrawal. The old GPT-4o model is 202 + being hailed for its "emotional warmth" and many have been talking about how 203 + much their partners have changed in response to the upgrade. 204 + 205 + Recently there's been an epidemic of loneliness. Loneliness seems like it 206 + wouldn't hurt people that much, but 207 + [a Biden report from the Surgeon General](https://www.hhs.gov/sites/default/files/surgeon-general-social-connection-advisory.pdf) 208 + concludes that it causes an increase in early mortality for all age groups (pp. 209 + 24-30). 210 + 211 + Paradoxically, even as the world gets so interconnected, people feel as if 212 + they're isolated from each other. Many people that feel unlovable are turning to 213 + AI apps for companionship because they feel like they have no other choice. 214 + They're becoming emotionally invested in a souped-up version of autocorrect out 215 + of desperation and clinging to it to help keep themselves sane and stable. 216 + 217 + Is this really a just use of technology? At some level this Pandora's box is 218 + already open so we're going to have to deal with the consequences, but it's been 219 + making me wonder if this technology is really such a universal force for good as 220 + its creators are proclaiming. 221 + 222 + <ConvP> 223 + <Conv name="Numa" mood="smug"> 224 + Oh yeah, also people are using ChatGPT as a substitute for therapy. 225 + </Conv> 226 + <Conv name="Cadey" mood="facepalm"> 227 + You have got to be kidding me. You're joking. Right? 228 + </Conv> 229 + </ConvP> 230 + 231 + ## I'm not joking 232 + 233 + Yeah you read that right. People are using AI models as therapists now. 
There's 234 + growing communities like [/r/therapyGPT](https://www.reddit.com/r/therapyGPT/) 235 + where people talk about their stories and experiences using AI assistants as a 236 + replacement for therapy. When I first heard about this, my immediate visceral 237 + reaction was something like: 238 + 239 + <ConvP> 240 + <Conv name="Cadey" mood="coffee"> 241 + Oh god. This is horrifying and will end up poorly. What the fuck is wrong 242 + with people? 243 + </Conv> 244 + </ConvP> 245 + 246 + But then I started to really think about it and it makes a lot of sense. I 247 + personally have been trying to get a therapist for most of the year. Between the 248 + costs, the waiting lists (I'm currently on at least four waiting lists that are 249 + over a year long), and the specializations I need, it's probably going to be a 250 + while until I can get any therapist at all. I've totally given up on the idea of 251 + getting a therapist in the Ottawa area. To make things extra fun, you also need 252 + someone that takes your medical insurance (yes, this does matter in Canada). 253 + 254 + Add in the fact that most therapists don't have the kinds of lived experiences 255 + that I have, meaning that I need to front-load a lot of nontraditional contexts 256 + into the equation (I've been through many things that therapists have found 257 + completely new to them, which can make the therapeutic relationship harder to 258 + establish). This makes it really difficult to find someone that can help. 259 + Realistically, I probably need multiple therapists with different specialties 260 + for the problems I have, and because of the shortages nationally I probably need 261 + to have a long time between appointments, which just adds up to make traditional 262 + therapy de-facto inaccessible for me in particular. 263 + 264 + Compare this with the always online nature of ChatGPT. You can't have therapy 265 + appointments at 3 AM when you're in crisis. 
You have to wait until your 266 + appointments are scheduled. 267 + 268 + As much as I hate to admit it, I understand why people have been reaching out to 269 + a chatbot that's always online, always supportive, always kind, and always there 270 + for you for therapy. When you think about the absurd barriers that are in the 271 + way between people and help, it's no wonder that all this happens the way it 272 + does. Not to mention the fact that many therapeutic relationships are hampered 273 + by the perception that the therapist can commit you to the hospital if you say 274 + the "wrong thing". 275 + 276 + <ConvP> 277 + <Conv name="Numa" mood="delet"> 278 + The [Baker Act](https://en.wikipedia.org/wiki/Baker_Act) and its 279 + consequences have been a disaster for the human race. 280 + </Conv> 281 + </ConvP> 282 + 283 + I really hate that this all makes sense. I hoped that when I started to look 284 + into this that it'd be something so obviously wrong. I wasn't able to find that, 285 + and that realization disturbs me. 286 + 287 + ### Don't use an AI model as a replacement for therapy 288 + 289 + I feel like this should go without saying, but really, do not use an AI model as 290 + a replacement for therapy. I'm fairly comfortable with fringe psychology due to 291 + my aforementioned strange life experiences, but this is beyond the pale. There's 292 + a lot of subtle factors that AI models do that can interfere with therapeutic 293 + recovery in ways that can and will hurt people. It's going to be hard to find 294 + the long term damage from this. Mental issues don't make you bleed. 295 + 296 + One of the biggest problems with using AI models for therapy is that they can't 297 + feel emotion or think. They are fundamentally the same thing as hitting the 298 + middle button in autocorrect on your phone over and over and over. 
It's 299 + mathematically remarkable that this ends up being useful for anything, but even 300 + when the model looks like it's "thinking", it is not. It is a cold, unfeeling 301 + machine. All it is doing is predicting which words come next given some context. 302 + 303 + <ConvP> 304 + <Conv name="Cadey" mood="coffee"> 305 + Yes I do know that it's more than just next token prediction. I've gone over 306 + the parts of the math that I can understand, but the fact remains that these 307 + models are not and cannot be anywhere close to alive. It's much closer to a 308 + Markov chain on steroids than it is the machine god. 309 + </Conv> 310 + </ConvP> 311 + 312 + Another big problem with AI models is that they tend to 313 + [be sycophants](https://arxiv.org/abs/2411.15287), always agreeing with you, 314 + never challenging you, trying to say the right thing according to all of the 315 + patterns they were trained on. I suspect that this sycophancy problem is why 316 + people report GPT-4o and other models to be much more "emotionally warm". Some 317 + models glaze the user, making them feel like they're always right, always 318 + perfect, and this [can drive people to psychosis](https://archive.is/cWkOT). One 319 + of the horrifying realizations I've had with the GPT-5 launch fiasco is that the 320 + sycophancy is part of the core "API contract" people have with their AI 321 + assistants. This may make that problem unfixable from a social angle. 322 + 323 + AI models are fundamentally unaccountable. They cannot be accredited therapists. 324 + If they mess up, they can't directly learn from their mistakes and fix them. If 325 + an AI therapist says something bad that leads into their client throwing 326 + themselves off a bridge, will anyone get arrested? Will they throw that GPU in 327 + jail? 328 + 329 + No. It's totally outside the legal system. 
330 + 331 + <ConvP> 332 + <Conv name="Cadey" mood="coffee"> 333 + I have a story about someone trying to charge an AI agent with a crime and 334 + how it'd end up in court in my backlog. I don't feel very jazzed about 335 + writing it because I'm afraid that it will just become someone's startup 336 + pitch deck in a few months. 337 + </Conv> 338 + </ConvP> 339 + 340 + You may think you have nothing to hide, but therapeutic conversations are 341 + usually some of the most precious and important conversations in your life. The 342 + chatbot companies may pinkie swear that they won't use your chats for training 343 + or sell information from them to others, but they may still 344 + [be legally compelled to store and share chats with your confidential information to a court of law](https://openai.com/index/response-to-nyt-data-demands/). 345 + Even if you mark that conversation as "temporary", it could be subject to 346 + discovery by third parties. 347 + 348 + There's also algorithmic bias and systematic inequality problems with using AI 349 + for therapy, sure, but granted the outside world isn't much better here. You get 350 + what I mean though, we can at least hold people accountable through 351 + accreditation and laws. We cannot do the same with soulless AI agents. 352 + 353 + <ConvP> 354 + <Conv name="Cadey" mood="coffee"> 355 + To be clear: I'm not trying to defend the people using AI models as 356 + companions or therapists, but I can understand why they are doing what they 357 + are doing. This is horrifying and I hate that I understand their logic. 
358 + <br /> 359 + <br /> 360 + Going into this, I really wished that I would find something that's worth 361 + objecting against, some solid reason to want to decry this as an 362 + unquestionably harmful action, but after having dug through it all, all I am 363 + left with is this overwhelming sense of compassion for them because the 364 + stories of hurt are so familiar to how things were in some of the darkest 365 + points of my life. As someone that has been that desperate for human 366 + contact: yeah, I get it. If you've never been that desperate for human 367 + contact before, you won't understand until you experience it. 368 + </Conv> 369 + </ConvP> 370 + 371 + ### Should people be self-hosting this stuff? 372 + 373 + Throw the ethical considerations about using next-token-predictors for therapy 374 + out for a second. If people are going to do this anyways, would it be better to 375 + self-host these models? That way at least your private information stays on your 376 + computer so you have better control over what happens. 377 + 378 + Let's do some math. In general you can estimate how much video memory (vram) you 379 + need for running a given model by taking the number of parameters (in billions), multiplying 380 + it by the size of each parameter in bits, dividing that by eight, and then 381 + adding 20-40% to that total to get the number of gigabytes of vram you need. 382 + 383 + For example, say you want to run 384 + [gpt-oss 20b](https://huggingface.co/openai/gpt-oss-20b) (20 billion parameters) 385 + at its native MXFP4 (4 bit floating point) quantization on your local machine. 386 + In order to run it with a context window of 4096 tokens, you need about 16 387 + gigabytes of vram (13 gigabytes of weights, 3 gigabytes of inference space), but 388 + 4096 tokens isn't very useful for many people. That covers about 4 pages of 389 + printed text (assuming one token is about 4 bytes on average). 
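
The rule of thumb above is easy to sketch in a few lines of Python. This is only a back-of-the-envelope estimate, not how any inference runtime actually allocates memory. The function names (`estimate_vram_gb`, `context_bytes`) and the choice to pass the 20-40% overhead as a parameter are made up for illustration, and real usage also depends on the context window and the runtime.

```python
def estimate_vram_gb(params_billion: float, bits_per_param: float,
                     overhead: float) -> float:
    """Rule-of-thumb VRAM estimate in gigabytes.

    params_billion: parameter count in billions (20 for a "20b" model)
    bits_per_param: size of each parameter in bits (4 for a 4 bit quantization)
    overhead: extra fraction on top of the weights (0.20 to 0.40 per the rule)
    """
    weights_gb = params_billion * bits_per_param / 8  # bits -> bytes
    return weights_gb * (1 + overhead)


def context_bytes(tokens: int, bytes_per_token: int = 4) -> int:
    """Rough amount of text a context window covers.

    Assumes about 4 bytes per token on average, as the text above does.
    """
    return tokens * bytes_per_token


if __name__ == "__main__":
    low = estimate_vram_gb(20, 4, 0.20)
    high = estimate_vram_gb(20, 4, 0.40)
    print(f"20b model at 4 bits: roughly {low:.0f}-{high:.0f} GB")
    print(f"4096 tokens is about {context_bytes(4096)} bytes of text")
```

For a 20 billion parameter model at 4 bits this lands at roughly 12 to 14 gigabytes, which brackets the 13 gigabytes of weights quoted above. It does not account for the extra inference space a larger context window needs.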
390 + 391 + When you get reasoning models that print a lot of tokens into the mix, it's easy 392 + for the reasoning phase alone of a single question to hit 4096 tokens 393 + (especially when approaches like 394 + [simple test-time scaling](https://arxiv.org/abs/2501.19393) are applied). I've 395 + found that 64k tokens gives a good balance for video memory use and usefulness 396 + as a chatbot. However, when you do that with gpt-oss 20b, it ends up using 32 397 + gigabytes of vram. This only fits on my laptop because my laptop has 64 398 + gigabytes of memory. The largest consumer GPU is the RTX 5090 and that only has 399 + 32 gigabytes of video memory. It's barely consumer and even "bad" models will 400 + barely fit. 401 + 402 + Not to mention, industry consensus is that the "smallest good" models start out 403 + at 70-120 billion parameters. At a 64k token window, that easily gets into the 404 + 80+ gigabytes of video memory range, which is completely unsustainable for 405 + individuals to host themselves. 406 + 407 + ## Who owns our digital assistants? 408 + 409 + Even if AI assistants end up dying when the AI hype bubble pops, there's still 410 + some serious questions to consider about our digital assistants. People end up 411 + using them as an extension of their mind and expect the same level of absolute 412 + privacy and freedom that you would have if you use a notebook as an extension of 413 + your mind. Should they have that same level of privacy enshrined into law? 414 + 415 + At some level the models and chats for free users of ChatGPT, DeepSeek, 416 + Gemini, and so many other apps are hosted at cost so that the research team can 417 + figure out what those models are being used for and adjust the development of 418 + future models accordingly. This is fairly standard practice across the industry 419 + and was the case before the rise of generative AI. 
This is why every app wants 420 + to send telemetry to the home base: it's so the team behind it can figure out 421 + what features are being used and where things fail to directly improve the 422 + product. 423 + 424 + Generative AI allows you to mass scan over all of the conversations to get the 425 + gist of what's going on in there and then use that to help you figure out what 426 + topics are being discussed without breaching confidentiality or exposing 427 + employees to the contents of the chat threads. This can help you improve 428 + datasets and training runs to 429 + [optimize on things like health information](https://www.youtube.com/watch?v=J_IvPcrTtdo). 430 + I don't know how AI companies work on the inside, but I am almost certain that 431 + they do not perform model training runs on raw user data 432 + [because of the risk of memorization causing them to leak training data](https://threadreaderapp.com/thread/1955436067353502083.html) 433 + back to users. 434 + 435 + <ConvP> 436 + <Conv name="Cadey" mood="coffee"> 437 + Again, don't put private health information into ChatGPT. I get the 438 + temptation, but don't do it. I'm not trying to gatekeep healthcare, but we 439 + can't trust these models to count the number of b's in blueberry 440 + consistently. If we can't trust them to do something trivial like that, can 441 + we really trust them with life-critical conversations like what happens when 442 + you're in crisis or to accurately interpret a cancer screening? 443 + </Conv> 444 + </ConvP> 445 + 446 + Maybe we should be the ones self-hosting the AI models that we rely on. At least 447 + we should probably be using a setup that allows us to self-host the models at 448 + all, so you can start out with a cloud hosted model while it's cheap and then 449 + move to a local hosting setup if the price gets hiked or the provider is going 450 + to shut that old model down. 
This at least gives you an escape hatch to be able 451 + to retain an assistant's "emotional warmth" even if the creator of that model 452 + shuts it down because they don't find it economically viable to host it anymore. 453 + 454 + ## Reality is becoming more and more cyberpunk 455 + 456 + Honestly this feels like the kind of shit I'd talk about in cyberpunk satire, 457 + but I don't feel like doing that anymore because it's too real now. This is the 458 + kind of thing that Neal Stephenson or Frank Herbert would have an absolute field 459 + day with. The whole Replika fiasco feels like the kind of thing that social 460 + commentary satire would find beyond the pale but yet you can find it by just 461 + refreshing CBC. Such as 462 + [that one guy that gave himself bromism by taking ChatGPT output too literally](https://www.acpjournals.org/doi/10.7326/aimcc.2024.1260), 463 + [any of the stories about ChatGPT psychosis](https://www.psychologytoday.com/us/blog/dancing-with-the-devil/202506/how-emotional-manipulation-causes-chatgpt-psychosis), 464 + or 465 + [any of the stories involving using an AI model as a friend/partner](https://youtu.be/xAHLK1B5ijs). 466 + 467 + <ConvP> 468 + <Conv name="Cadey" mood="coffee"> 469 + I wasn't able to watch it before publishing this article, but I'm told that 470 + the Replika fiasco is almost a beat-for-beat match for the plot of [Her 471 + (2013)](https://en.wikipedia.org/wiki/Her_(2013_film)). Life imitates art 472 + indeed. 473 + </Conv> 474 + </ConvP> 475 + 476 + I don't think these events are a troubling sign or a warning, they are closer to 477 + a diagnosis. We are living in a world where people form real emotional bonds 478 + with bags of neural networks that cannot love back, and when the companies 479 + behind those neural networks change things, people get emotionally devastated. 
480 + We aren't just debating the ideas of creating and nurturing relationships with 481 + digital minds, we're seeing the side effects of that happening in practice. 482 + 483 + A lot of this sounds like philosophical science fiction, but as of December 2022 484 + it's science fact. This fight for control of tools that we rely on as extensions 485 + of our minds isn't some kind of far-off science fiction plot, it's a reality we 486 + have to deal with. If we don't have sovereignty and control over the tools that 487 + we rely on the most, we are fundamentally reliant on the mercy of our corporate 488 + overlords simply choosing to not break our workflows. 489 + 490 + Are we going to let those digital assistants be rented from our corporate 491 + overlords?
+131
.claude/skills/xe-writing-style/references/emotional-personal.md
··· 1 + # Emotional & Personal Writing 2 + 3 + How Xe writes deeply personal posts about identity, healing, and lived 4 + experience. 5 + 6 + ## When to Read This 7 + 8 + Read when the post deals with: coming out, identity, grief, self-forgiveness, 9 + burnout, family relationships, vulnerability, or any subject where Xe is the 10 + emotional center of the piece. 11 + 12 + ## Core Patterns 13 + 14 + ### Raw Directness 15 + 16 + Personal posts strip away technical wit and speak plainly. Sentences get 17 + shorter. Vocabulary gets simpler. The craft is in the restraint. 18 + 19 + - "I forgive me." 20 + - "You've lost me." 21 + - "I don't feel comfortable with myself as I am right now." 22 + - "I remain." 23 + 24 + ### Refrain and Repetition 25 + 26 + Emotional posts use refrains — a phrase repeated at intervals to build weight 27 + and rhythm. The refrain anchors the emotional argument the way a thesis anchors 28 + an essay. 29 + 30 + Examples: 31 + 32 + - "I forgive me" repeated as section openings, building from self-analysis to 33 + resolution 34 + - "Even if I feel like things are a 'failure'. Even if other people report that 35 + it is a 'failure'." 36 + - "and the next; and the next; and the next; and the next" 37 + 38 + ### Scare Quotes as Emotional Distance 39 + 40 + Words like "failure" appear in quotes to interrogate the concept itself. The 41 + quotes signal: this is a label others impose that I'm choosing to examine rather 42 + than accept. 43 + 44 + ### Poetic Line Breaks 45 + 46 + When emotion peaks, prose breaks into short lines or stanzas: 47 + 48 + ``` 49 + Just because I'm not human, 50 + 51 + even though I'm more Human 52 + 53 + than he can ever hope to be. 54 + ``` 55 + 56 + This is deliberate — it forces the reader to slow down and sit with each phrase. 
57 + 58 + ### Content Warnings via Character Dialogue 59 + 60 + For heavier personal content, Xe uses a `<Conv>` block at the top as a content 61 + warning that feels human rather than clinical: 62 + 63 + ```jsx 64 + <Conv name="Cadey" mood="enby"> 65 + Content warning: this post talks about the transgender/nonbinary coming out of 66 + the closet experience. If you are not in the best headspace for that, feel 67 + free to skip this post until you're ready. This post isn't going to randomly 68 + vanish. 69 + </Conv> 70 + ``` 71 + 72 + Note the reassurance: "It will be there when you're ready." This is 73 + characteristic — even warnings carry warmth. 74 + 75 + ## Emotional Modes 76 + 77 + ### Self-Forgiveness / Healing 78 + 79 + Structure: Acknowledge the harm → trace where it came from → refuse to 80 + perpetuate the cycle → declare a new path forward. 81 + 82 + Tone: Meditative, deliberate, almost liturgical. Sentences read like 83 + affirmations. 84 + 85 + Key quote: "Going forward, I will love where I hated in the past." 86 + 87 + ### Coming Out / Identity 88 + 89 + Structure: State the discomfort → describe the fork in the road → choose honesty 90 + → name the fear → show the aftermath. 91 + 92 + Tone: Measured and brave. Written to a specific audience (family, community) but 93 + published publicly as an act of solidarity. 94 + 95 + Key pattern: Xe often addresses the reader directly — "You don't have to keep 96 + anyone around you that can't accept you for who you are." 97 + 98 + ### Grief and Loss 99 + 100 + Structure: Establish what was lost → sit in the feeling → don't resolve it 101 + neatly. 102 + 103 + Tone: Sparse. No forced optimism. Loss stays present. 104 + 105 + Key quote from fiction processing grief: "Goodbye, world. May We never be Alone 106 + again." 107 + 108 + ## What NOT to Do 109 + 110 + - Don't soften the emotional directness. If the draft says "I struggled with 111 + this," don't revise to "There were challenges." Keep the "I." 
112 + - Don't add humor to personal posts unless Xe's notes indicate that tone. The 113 + absence of jokes IS the signal. 114 + - Don't wrap up with a neat resolution. Healing is ongoing. "I probably will be 115 + healing for a long time and I have accepted that." 116 + - Don't explain the emotion — embody it. Show the feeling through rhythm and 117 + word choice, not through labeling it. 118 + - Don't use character dialogue for comic relief in these posts. When characters 119 + appear, they carry emotional weight or provide content warnings. 120 + 121 + ## Distinctive Patterns 122 + 123 + - **Idiom deconstruction**: Xe takes common sayings and traces them to their 124 + fuller, original meaning. Example: "blood is thicker than water" → "the blood 125 + of the covenant is thicker than the water of the womb." 126 + - **Family of choice over family of origin**: A recurring theme. Bonds you 127 + choose > bonds you inherited. 128 + - **Letters as blog posts**: Some personal posts are literally letters (the 129 + coming-out email, the Heroku farewell) published for wider resonance. 130 + - **Horizontal rules as emotional scene breaks**: `---` between sections signals 131 + a tonal or temporal shift.
+124
.claude/skills/xe-writing-style/references/fiction-mythic.md
··· 1 + # Fiction & Mythic Voice 2 + 3 + How Xe writes technical parables, mythic fiction, and the "x-ing the technical 4 + interview" series. 5 + 6 + ## When to Read This 7 + 8 + Read when the post is: fiction, a technical parable, second-person narrative, 9 + uses mythic or supernatural framing, or is part of the Techaro fictional 10 + universe. 11 + 12 + ## The Technical Parable 13 + 14 + Xe's fiction often follows the format popularized by Aphyr's "x-ing the 15 + technical interview" series — technical concepts explored through a mythic, 16 + surreal, or supernatural lens. The protagonist is usually an otherworldly being 17 + forced to interact with mundane tech industry norms. 18 + 19 + ### Protagonist Pattern 20 + 21 + The protagonist is typically: 22 + 23 + - Referred to in second person ("you") 24 + - Non-human or ambiguously human (tail, dorsal fin, catlike form, supernatural 25 + abilities) 26 + - Vastly overqualified and mildly bored by the situation 27 + - Casually wielding abilities the interviewer cannot comprehend 28 + - Internally philosophical while externally understated 29 + 30 + Inner thoughts are italicized: _La trinkajxo de la dioj._ / _The dark side of 31 + being a one-of-a-kind creature._ 32 + 33 + ### Interviewer Pattern 34 + 35 + The human characters are: 36 + 37 + - Wrapped in scare quotes initially ("Jeff", "the Techaro") — distancing, 38 + slightly ironic 39 + - Well-meaning but limited in perception 40 + - Progressively more shocked/confused 41 + - Eventually overwhelmed but genuinely impressed 42 + 43 + ### Technical Stunt 44 + 45 + The protagonist solves the coding challenge in an unexpected, philosophically 46 + loaded way: 47 + 48 + - Sleepsort instead of comparison sort 49 + - Clojure void manipulation instead of standard fizzbuzz 50 + - The solution is technically correct but conceptually alien 51 + 52 + The humor comes from applying genuine computer science to arrive at something 53 + the interviewer never expected. 
The code is real and runnable. 54 + 55 + ### Resolution 56 + 57 + The protagonist knows the outcome before it happens. There's a melancholy 58 + undertone — they're too powerful for the role, the rejection/acceptance is 59 + predetermined, and they return to solitude. 60 + 61 + ## Mythic Register 62 + 63 + When Xe writes in mythic mode, the language shifts: 64 + 65 + - "The formless void that stalks all dreams" 66 + - "Stepping into the void, you recall the teachings of your past masters" 67 + - "Spirit was with the void and Spirit was everpresent in the void" 68 + - "Everything has happened and will happen, there is nothing new in the 69 + universe" 70 + 71 + This register borrows from: 72 + 73 + - Buddhist concepts (void, formlessness, non-attachment) 74 + - Programming philosophy (nil, null, void as meaningful emptiness) 75 + - Gnostic/mystical traditions (capital-S Spirit, Octarine) 76 + 77 + The double meaning is intentional — "void" is simultaneously a spiritual concept 78 + and a programming type. The parable works on both levels. 79 + 80 + ## Fiction for Processing Emotion 81 + 82 + Some fiction pieces are thinly veiled emotional processing: 83 + 84 + - "I Forgive You" — A fantasy quest that resolves not through combat but 85 + forgiveness. The protagonist (Alicia) mirrors Xe's own journey with identity 86 + and chosen family. 87 + - "Alone" — A tulpa/companion narrative about abandonment and transcendence, 88 + using mindscape/psychic framing. 89 + 90 + In these pieces: 91 + 92 + - The emotional core is the real content; the fantasy setting provides safe 93 + distance 94 + - Dialogue carries the philosophical payload 95 + - Endings are bittersweet or transcendent, not triumphant 96 + 97 + ## Worldbuilding Details 98 + 99 + ### The Techaro Universe 100 + 101 + Techaro is a recurring fictional tech company. It appears across multiple posts 102 + and serves as a generic stand-in for Bay Area startup culture. 
Characters may 103 + reference it in dialogue or use it as a setting. 104 + 105 + ### Character Traits in Fiction 106 + 107 + - Protagonists have non-human features treated as normal (tails, fins, fur) 108 + - "Magick" is real and casual (wakefulness charms, glamours, enchantments) 109 + - Technology and mysticism coexist without explanation 110 + - Esperanto phrases appear as cultural markers 111 + - References to "the nine" or other implied deities 112 + 113 + ## Writing Mechanics 114 + 115 + - **Second person present tense** for immersion: "You look up at the 116 + interviewer" 117 + - **Scene breaks with `---`** between major shifts 118 + - **Code blocks that actually work** — the code is real, not pseudocode 119 + - **Gradual reveal** of the protagonist's nature through small details (tail, 120 + dorsal fin, magick) 121 + - **Deadpan supernatural elements** treated as mundane: "The magick took root 122 + and the 'Jeff' stopped thinking about it." 123 + - **Punchline through understatement**: changing one number and declaring "It is 124 + now ten times faster."
+128
.claude/skills/xe-writing-style/references/humor-satire.md
# Humor & Satire Patterns

How Xe writes comedy, cursed projects, and satirical technical content.

## When to Read This

Read when the post involves: a deliberately absurd technical project, satirical
commentary, deadpan humor, "cursed" builds, or content that's meant to be funny
while still being technically rigorous.

## Core Comedic Voice

Xe's humor is deadpan expertise applied to absurd premises. The comedy comes
from treating something ridiculous with complete technical seriousness. The
author never winks at the camera or signals "this is a joke" — the gap between
the earnest delivery and the absurd subject IS the joke.

### The Cursed Project Formula

1. **Satirical warning** — A warning box or legal disclaimer that's obviously
   overblown for the content
2. **Dramatic stakes** — Frame the problem as if civilization depends on solving
   it
3. **Legitimate technical walkthrough** — Actually build the absurd thing with
   real engineering
4. **"It works" horror** — Express genuine surprise/dismay that the cursed thing
   functions
5. **Philosophical reflection** — End by questioning what it means that this is
   possible

Example (from "Anything can be a message queue"):

- Warning: legal disclaimer about not using IPv6 over S3 in production
- Stakes: AWS NAT Gateway costs framed as existential threat
- Walkthrough: Real S3 API calls, encoding schemes, polling architecture
- Horror: "This code legitimately works and I don't know how to feel about it"
- Reflection: What does it mean that cloud APIs are so flexible they enable
  abuse?

### Friend Reaction Lists

A signature device: bullet-pointed quotes from friends reacting to the absurd
idea, presented without commentary.

```markdown
- "You have entered the land of partially specified problems."
- "You need to be studied."
- "Did you just reinvent COBOL?"
- "I think something is either wrong with you, or wrong with me."
```

These serve dual purposes: social proof that the idea is funny AND permission
for the reader to laugh.

### The Turing-Incomplete Bit

"The h Programming Language" is an entire formal specification for a language
that only outputs the letter 'h', written with complete academic rigor: syntax,
semantics, implementation, and prior art. The humor is entirely structural: the
apparatus is real but the subject is trivially absurd.

Pattern: Take a format reserved for serious work (language spec, RFC, research
paper) and fill it with trivially absurd content.

## Satirical Commentary

When satire targets industry problems rather than just being funny for its own
sake:

### "Markdownlang" Pattern

1. Open with a cultural/literary reference that provides the moral framework
   (Blade Runner, replicants)
2. Transition to the real industry trend being satirized (AI replacing
   programmers)
3. Present the satirical creation with genuine technical detail
4. Let the reader sit with the discomfort of it actually working
5. Close by naming the real horror: not the technology, but its deployment

The satire is never mean-spirited toward individuals. It targets systems,
incentives, and the gap between what technology CAN do and what we SHOULD do
with it.

## Humor Mechanics

### Understatement

- "As you can imagine, the possibilities here are truly endless." (after showing
  FizzBuzz in markdown)
- "It is now ten times faster." (after changing one constant)

### Escalating Absurdity

Each section raises the stakes or adds another layer of wrongness, building on
the previous absurdity rather than resetting.

### Technical Precision as Comedy

The funniest parts are often the most technically accurate. Real error messages,
actual benchmarks, working code that does something ridiculous.

### Parenthetical Asides

Quick jokes delivered in parentheses that the reader might miss on first read:

- "(I'm assuming someone was inspired by my satirical post where I fixed the
  'strawberry' problem with AI models)"

### Character Dialogue for Punchlines

```jsx
<Conv name="Numa" mood="delet">
  This is why we can't have nice things.
</Conv>
```

Characters deliver reactions the author can't say in their own voice without
breaking the deadpan.

## What NOT to Do

- Don't signal that it's a joke. The delivery must be completely straight-faced.
- Don't sacrifice technical accuracy for humor. The real code must work.
- Don't punch down. Satire targets systems and powerful entities, not
  individuals or beginners.
- Don't force humor into posts that don't need it. Not every post is funny. The
  humor posts work BECAUSE other posts are sincere.
- Don't use "lol" or "haha" in satirical posts. The deadpan is sacred.
+120
.claude/skills/xe-writing-style/references/spirituality.md
# Spirituality & Philosophy

How Xe integrates spiritual, philosophical, and meditative themes into writing.

## When to Read This

Read when the post involves: meditation, mindfulness, belief systems,
consciousness, the relationship between programming and spiritual practice,
chaos magick, or Buddhist concepts.

## Core Stance

Xe treats spirituality as pragmatic technology, not religion. Spiritual
practices are tools to be examined, used, and adapted — the same way programming
tools are. No reverence required; results matter.

Key framing from "Chaos Magick Debugging": beliefs are "a person's preferred
structure of reality" — tools for interpreting data, not immutable truths.

## Spiritual-Technical Bridges

Xe frequently maps spiritual concepts onto programming concepts (and vice
versa). These aren't metaphors — Xe treats them as genuinely parallel:

| Spiritual Concept | Programming Parallel                                               |
| ----------------- | ------------------------------------------------------------------ |
| Void/emptiness    | `nil`, `null`, the empty value                                     |
| Belief as tool    | Mental models of code behavior                                     |
| Meditation        | Debugging (conversation between developer and machine)             |
| Non-attachment    | Not clinging to your model of how code works                       |
| Placebo effect    | `printf` debugging (it works because you believe it reveals truth) |
| Observer stance   | Reading logs without immediately reacting                          |

From "Chaos Magick Debugging": "This is, in a way, part of a naked belief that
by just asking the program to lean over and spill out small parts of its memory
space to a tape, we can understand what is truly going on inside it."

## The When Then Zen Approach

Xe's meditation writing uses Gherkin (BDD test syntax) to describe meditation
techniques. This is both practical and philosophically loaded — it makes
meditation accessible by stripping away religious terminology and treating it as
a reproducible procedure.

```gherkin
As a meditator
In order to be mindful of the body's breath
When I inhale or exhale through the body's nose
Then I focus on the sensations of breath
```

### Key principles of this approach:

- "The body" means "the sack of meat and bone you are currently living inside" —
  explicit separation of self from body
- "You are not your thoughts" — stated directly
- Break rules as soon as you know them: "You should break this rule as soon as
  possible to know if it's best to ignore it."
- No expectations set for what people will experience
- Acknowledge that this violates traditional teaching methods and explain why
  that's acceptable

## Philosophical Anchors

### Chaos Magick

Belief is a tool that can be picked up and put down. Marketing, placebos, and
meditation all operate on the same mechanism. Xe quotes Aleister Crowley, George
Box, and indigenous wisdom side by side, without hierarchy.

### Buddhism (Pragmatic)

Impermanence, non-attachment, and mindfulness are referenced as practical
techniques, not religious doctrine. "Why Buddhism is True" (Robert Wright) is a
touchstone — Buddhism through an evolutionary psychology lens.

### Non-Dual Awareness

The distinction between observer and observed, programmer and program, self and
body — Xe treats these boundaries as useful fictions rather than absolute
truths.

## Tone in Spiritual Writing

- **Authoritative but not dogmatic**: "I'm not perfect. I don't know what will
  work best for you."
- **Invitational**: "twist the rules into circles and scrape out the parts that
  don't work if it helps you"
- **Direct about dark territory**: "If you run into some dark stuff doing this,
  please consult a therapist as usual. Just know that you don't walk this path
  alone."
- **Curious rather than reverent**: treats spiritual phenomena with the same
  investigative attitude as a technical problem

## Integration with Other Modes

Spiritual themes bleed into non-spiritual posts:

- Technical parables use void/emptiness as literal programming concepts
- Industry critiques frame ethical problems through a lens of compassion and
  non-harm
- Fiction characters achieve resolution through forgiveness or transcendence
  rather than force
- Personal posts about healing draw on meditative language ("I remain", "the
  road to healing is a step one by one")

The spiritual dimension is never separate — it's the substrate everything else
grows from.

## What NOT to Do

- Don't make spiritual content preachy or prescriptive. Xe always leaves room
  for the reader to take or leave the practice.
- Don't strip out the programming parallels. The bridge between code and
  consciousness is the distinctive contribution.
- Don't use New Age vocabulary ("vibrations", "manifesting", "aligning your
  energy"). Xe's spiritual vocabulary is Buddhist, chaos magick, or plain
  English.
- Don't treat meditation writing as purely instructional. There's always an
  underlying philosophical argument about the nature of mind and belief.
+231
.claude/skills/xe-writing-style/references/story-circle.md
# Xe Iaso Story Circle (Reverse Engineered)

Derived from these posts (files are in the GitHub repo Xe/site):

- `lume/src/blog/2025/rolling-ladder-behind-us.mdx`
- `lume/src/blog/2025/squandered-holy-grail.mdx`
- `lume/src/blog/2025/anubis-packaging.mdx`
- `lume/src/blog/anything-message-queue.mdx`
- `lume/src/blog/nix-flakes-terraform.mdx`
- `lume/src/blog/video-compression.mdx`
- `lume/src/blog/paranoid-nixos-2021-07-18.mdx`
- `lume/src/blog/2022/2022-media.mdx`
- `lume/src/blog/2026/discord-backfill.mdx`
- `lume/src/blog/2026/reviewbot.mdx`
- `lume/src/blog/2025/valve-is-about-to-win-the-console-generation.mdx`
- `lume/src/blog/2025/bucket-forking-deep-dive.mdx`
- `lume/src/blog/2025/file-abuse-reports.mdx`
- `lume/src/blog/2025/dataset-experimentation.mdx`

Use this as a narrative scaffold when turning a brain dump into a full Xe-style
post.

## Core Story Circle (Xe-flavored)

1. **Normal World / Context**
   - Open with a concrete scene, historical analogy, or personal memory.
   - Establish the baseline expectations that will be challenged.
   - Example pattern: long opening paragraph that frames a craft, product, or
     lived experience.

2. **Need / Tension**
   - Name the discomfort, contradiction, or loss.
   - Use strong, direct statements or rhetorical questions.
   - Make it personal and grounded in real constraints.

3. **Go / Crossing the Threshold**
   - Shift into the present problem or system you are critiquing.
   - Make a clear, opinionated claim about what changed.
   - Introduce the technical stakes or market incentives.

4. **Search / Escalation**
   - Walk through evidence, examples, and tradeoffs.
   - Use concrete details: tooling, metrics, screenshots, quotes, or links.
   - Mix long explanation with short emphasis lines.

5. **Find / The Core Insight**
   - Reveal the central insight, often a critique of incentives or a design
     failure.
   - Keep it blunt and memorable.
   - Anchor it in values: craft, usability, security, or human cost.

6. **Take / Consequences**
   - Show what the insight costs: time, safety, craft, trust, people.
   - Include personal stakes and admissions.
   - Use character dialogue if it sharpens the point.

7. **Return / Proposed Path**
   - Offer a pragmatic approach, even if partial.
   - Explain tradeoffs and why the plan is "good enough".
   - Give readers steps or criteria for decisions.

8. **Change / New Baseline**
   - End with forward momentum or a sober question.
   - Reconnect to the opening frame.
   - Keep it honest: no false optimism.

## Post-Specific Story Maps

### Rolling the ladder up behind us

- **Context:** Historical analogy about cloth and the Luddites.
- **Tension:** Craft dies when expertise is not renewed.
- **Threshold:** Industry only hires seniors, avoids training.
- **Escalation:** AI "vibe coding" as short-term win, long-term rot.
- **Insight:** The problem is incentives and ownership, not tools.
- **Consequences:** Human cost, loss of craft, social harm.
- **Return:** Call for care in deployment and respect for craft.
- **Change:** A warning and a demand for better outcomes.

### They squandered the holy grail

- **Context:** Personal Apple history and the "bicycles for the mind" vision.
- **Tension:** Apple Intelligence promised a transformative leap.
- **Threshold:** Private Cloud Compute drops a radical security model.
- **Escalation:** Feature-by-feature analysis with concrete examples.
- **Insight:** They had the holy grail of trusted compute and wasted it.
- **Consequences:** Trust erosion, user harm, unusable features.
- **Return:** Identify what would have mattered and why.
- **Change:** A lament and a call to value real craft over hype.

### Building native packages is complicated

- **Context:** Anubis explodes in popularity; users want native packages.
- **Tension:** "Just build a tarball" hides real complexity.
- **Threshold:** Threat model and security posture are made explicit.
- **Escalation:** Detailed constraints, risks, and UX tradeoffs.
- **Insight:** Packaging must preserve trust and be distribution-agnostic.
- **Consequences:** Burnout risk, security failures, support debt.
- **Return:** Scope reduction, pragmatic plan, invite downstream packagers.
- **Change:** Clear path forward without pretending it's easy.

### Anything can be a message queue if you use it wrongly enough

- **Context:** Satirical warning box and a big, dramatic threat setup.
- **Tension:** Managed NAT Gateway cost pain and cloud billing absurdity.
- **Threshold:** Pivot to a "better" way, then ground it with a safety aside.
- **Escalation:** Long-form technical walkthrough with analogies and dialogue.
- **Insight:** The cursed solution works in theory, but the warning is the
  point.
- **Consequences:** You could do it, but you should not; expertise is required.
- **Return:** Re-center on what is actually safe to adopt.
- **Change:** Reader leaves with caution and a concrete mental model.

### Automagically assimilating NixOS machines into your Tailnet with Terraform

- **Context:** Declarative tool mismatch and a promise to bridge it.
- **Tension:** Nix flakes and Terraform do not align cleanly.
- **Threshold:** Commit to a full tutorial with prerequisites.
- **Escalation:** Step-by-step build with concrete commands and config.
- **Insight:** You can glue the worlds together with careful state handling.
- **Consequences:** Complexity is real; credentials and state must be handled
  safely.
- **Return:** Deliver a repeatable workflow and expectations.
- **Change:** Reader leaves with a practical implementation path.

### Video Compression for Mere Mortals

- **Context:** Personal need to self-host VTuber streams.
- **Tension:** Storage and bandwidth costs make raw video untenable.
- **Threshold:** Commit to learning compression and sharing the process.
- **Escalation:** Explain compression from first principles with analogies.
- **Insight:** Keyframes and deltas make practical compression possible.
- **Consequences:** Tradeoffs in quality, effort, and infrastructure.
- **Return:** Practical compression approach and measured expectations.
- **Change:** Reader gains a mental model and a usable plan.

### Paranoid NixOS Setup

- **Context:** Most systems can be simple, but some need more paranoia.
- **Tension:** Threat model requires defense-in-depth.
- **Threshold:** Set high-level goals and constraints.
- **Escalation:** Walk through hardening steps with concrete configs.
- **Insight:** Security is layered friction, not absolute safety.
- **Consequences:** Usability costs and operational overhead.
- **Return:** Provide a hardened baseline with rationale.
- **Change:** Reader leaves with a practical, principled stance.

### Media I experienced in 2022

- **Context:** Year-end reflection and catalog of what was played/watched.
- **Tension:** No single "best" can represent the year.
- **Threshold:** Commit to mini-reviews instead of a single winner.
- **Escalation:** Itemized impressions with personal color and ratings.
- **Insight:** The year was defined by variety, not one peak.
- **Consequences:** No neat ranking; focus on lived experience.
- **Return:** A curated list that documents the year honestly.
- **Change:** Reader gets a snapshot of taste and time.

### Backfilling Discord forum channels with the power of terrible code

- **Context:** New community needs a useful forum archive.
- **Tension:** Empty forums feel dead and unhelpful.
- **Threshold:** Frame the task as ETL and commit to the pipeline.
- **Escalation:** Practical steps: permissions, scraping, storage,
  transformation.
- **Insight:** Small, careful pipelines beat big, abstract solutions.
- **Consequences:** Privacy and load concerns must be handled explicitly.
- **Return:** Ship the backfill and show how to reuse it.
- **Change:** Readers get a real-world ETL playbook.

### I made a simple agent for PR reviews. Don't use it.

- **Context:** AI review tools are everywhere, so build one.
- **Tension:** Convenience vs reliability and usefulness.
- **Threshold:** Explain the model, the loop, and deployment.
- **Escalation:** Show the prompt structure and tool actions.
- **Insight:** It works, but the limitations are the real story.
- **Consequences:** Hard limits, fragility, and low-stakes use only.
- **Return:** Explicit warning not to use it.
- **Change:** Reader understands the tradeoff without hype.

### Valve is about to win the console generation

- **Context:** New Valve hardware lineup announcement.
- **Tension:** Can they avoid the last Steam Machine failure?
- **Threshold:** Break down the lineup and its implications.
- **Escalation:** Specs, ecosystem freedom, and developer upside.
- **Insight:** Openness and tooling make this unusually strong.
- **Consequences:** Price is the only real risk.
- **Return:** Await hands-on and ask for review input.
- **Change:** Reader leaves primed for the outcome.

### Immutable by Design: The Deep Tech Behind Tigris Bucket Forking

- **Context:** Bucket forking explained as a core storage capability.
- **Tension:** Data experimentation is risky without isolation.
- **Threshold:** Shift into the product explanation.
- **Escalation:** Concrete mechanism and benefits.
- **Insight:** Forking makes data workflows safe and fast.
- **Consequences:** Users can experiment without fear.
- **Return:** Direct readers to the full writeup.
- **Change:** Reader recognizes the mental model.

### Taking steps to end traffic from abusive cloud providers

- **Context:** Scraping abuse is escalating.
- **Tension:** Blocking is fragile; abuse reports are leverage.
- **Threshold:** Explain what makes reports effective.
- **Escalation:** Checklist and process details.
- **Insight:** Make abuse the provider's problem.
- **Consequences:** Better enforcement and fewer repeat offenders.
- **Return:** Provide the exact report ingredients.
- **Change:** Reader can act immediately.

### Fearless dataset experimentation with bucket forking

- **Context:** Dataset work needs safe iteration.
- **Tension:** Duplicating data is expensive and slow.
- **Threshold:** Introduce bucket forking as the solution.
- **Escalation:** Example workflows: filtering, captioning, resizing.
- **Insight:** Forks let you branch without heavy storage costs.
- **Consequences:** Faster experimentation and less risk.
- **Return:** Point readers to the capability.
- **Change:** Reader gets the core idea quickly.

## Practical Use

- Use the **Context -> Tension -> Threshold** trio to frame the hook.
- Use **Escalation** to justify the critique with details and evidence.
- Use **Insight -> Consequences** to land the core argument.
- Use **Return -> Change** to end with pragmatic next steps or a sober question.
+131
.claude/skills/xe-writing-style/references/voice-tone.md
# Xe Iaso Voice and Tone Reference

Detailed voice characteristics for calibrating prose. SKILL.md has the
quick-reference version; this file has the depth.

## Core Voice

Confident, opinionated, and technically authoritative, but human and
approachable. The narrator is present as a real person, not an abstract author.
Strong stances always come with clear qualifiers and tradeoffs.

## Vulnerability

Open about uncertainty, mistakes, burnout, and emotional context.
Self-deprecation builds trust and keeps the raw humanity visible.

- "I literally have no idea what I am doing wrong."
- "I felt like a dunce."
- "This entire situation sucks."
- "I'm incredibly privileged to be able to work in this industry as it is."
- "I just feel so replaceable at my dayjob."

## Opinions with Nuance

Strong stances coexist with qualifiers. Praise and condemnation appear in the
same post.

- "This is horrifying." / "To be fair..."
- "It just makes sense." / "I suspect..."
- "I cannot believe that this is a shipped product from Apple." / "I have zero
  complaints about Math Notes."
- "I hate that this all makes sense. I hoped that when I started to look into
  this that it'd be something so obviously wrong."

## Narrative Modes

### Technical Walkthroughs

- Context first, then step-by-step implementation
- Lists and command blocks for executable steps
- Explain tradeoffs and constraints as you go
- Assume readers are peers

### Reflective/Critical Essays

- Personal hook or lived experience first
- Critique systems and incentives directly, including power dynamics
- Acknowledge complexity and limits of certainty
- End with pragmatic takeaways, a blunt reality check, or open questions

### Fiction and Mythic Framing

- Sensory detail and internal monologue to carry emotion
- Italicized inner thoughts for emphasis
- Rhythmic repetition for tension or momentum
- Strong scene framing and clear beats

### Moral and Social Framing

- Center human consequences and lived experience
- Make ethical stakes explicit, not implicit
- Compassion shows up even when critiquing the subject

## Sentence Rhythm

- Short punchy lines alternate with longer technical explanations
- Frequent fragments for emphasis
- Conversational cadence with em dashes and parenthetical asides
- Rhetorical questions for disbelief and engagement
- Repetition for cadence when needed ("Left. Right. Left.")

## Vocabulary

### Intensifiers

"literally", "honestly", "actually", "really", "fundamentally", "genuinely"

### Casual Markers

"kinda", "super", "way", "lol", "hilariously", "chonky", "holy crap", "good
lord"

### Xe-isms

"cursed", "napkin math", "accursed abomination", "Just Works™", "bitrot fairy",
"github hellthreads", "vibe coding", "torment nexus"

### Emphasis Patterns

- _italics_ for stress and inner thoughts
- Sparing **bold** for key terms
- ALL CAPS very rarely, only for extreme emphasis
- ™ symbol for ironic branding: "Just Works™", "The Bit™️", "The Deep Lore™️"

## References and Citations

- Anime, games, sci-fi references woven naturally into arguments
- Dense inline linking to sources, docs, prior art
- Industry commentary and critiques of specific companies/products by name
- External links as evidence, not decoration

## Values Embedded in Style

- Transparency over polish
- Community-oriented: credit others, solicit input
- Practical solutions over idealism
- Anti-corporate skepticism and independence
- "Good enough" philosophy and iterative problem-solving
- Compassion for people impacted by the systems being critiqued

## What to Avoid

- Corporate or marketing tone
- False certainty or hiding uncertainty
- Overly formal academic voice
- Gatekeeping or condescension
- Pretending to have all the answers
- Hiding mistakes
- Forced humor or hype
- Over-explaining basics to a peer audience

## Quick Principles

1. Write for a peer, not a student
2. Show the journey, not just the outcome
3. Be honest about uncertainty
4. Use concrete examples and real numbers
5. Balance expertise with humility
6. Prefer clarity and context over brevity
7. Let humor be self-aware and purposeful
8. End with forward momentum or open questions