The code and data behind xeiaso.net

feat(blog): first look at the DGX Spark

Signed-off-by: Xe Iaso <me@xeiaso.net>

Xe Iaso c51f311c 43d641a1

+249
+88
lume/src/_components/Admonition.jsx
+ // --- Helper for Icons ---
+ // Using inline SVGs to keep everything in one file.
+ const InfoIcon = ({ className = "w-6 h-6" }) => (
+   <svg className={className} xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
+     <circle cx="12" cy="12" r="10" />
+     <line x1="12" y1="16" x2="12" y2="12" />
+     <line x1="12" y1="8" x2="12.01" y2="8" />
+   </svg>
+ );
+
+ const WarningIcon = ({ className = "w-6 h-6" }) => (
+   <svg className={className} xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
+     <path d="m21.73 18-8-14a2 2 0 0 0-3.46 0l-8 14A2 2 0 0 0 4 21h16a2 2 0 0 0 1.73-3Z" />
+     <line x1="12" y1="9" x2="12" y2="13" />
+     <line x1="12" y1="17" x2="12.01" y2="17" />
+   </svg>
+ );
+
+ const TipIcon = ({ className = "w-6 h-6" }) => (
+   <svg className={className} xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
+     <path d="M12 22c5.523 0 10-4.477 10-10S17.523 2 12 2 2 6.477 2 12s4.477 10 10 10z" />
+     <path d="m9 12 2 2 4-4" />
+   </svg>
+ );
+
+ const NoteIcon = ({ className = "w-6 h-6" }) => (
+   <svg className={className} xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="currentColor" strokeWidth="2" strokeLinecap="round" strokeLinejoin="round">
+     <path d="M4 22h14a2 2 0 0 0 2-2V7.5L14.5 2H6a2 2 0 0 0-2 2v4" />
+     <polyline points="14 2 14 8 20 8" />
+     <path d="M2 15h10" />
+     <path d="M2 19h5" />
+   </svg>
+ );
+
+
+ // --- Admonition Component ---
+ // This component displays a styled block for notes, warnings, tips, etc.
+ const Admonition = ({ type = 'note', title, children }) => {
+   const styles = {
+     note: {
+       bgColor: 'bg-blue-50 dark:bg-blue-900/20',
+       borderColor: 'border-blue-200 dark:border-blue-500/30',
+       iconColor: 'text-blue-500',
+       titleColor: 'text-blue-800 dark:text-blue-300',
+       icon: <NoteIcon />,
+     },
+     warning: {
+       bgColor: 'bg-red-50 dark:bg-red-900/20',
+       borderColor: 'border-red-200 dark:border-red-500/30',
+       iconColor: 'text-red-500',
+       titleColor: 'text-red-800 dark:text-red-300',
+       icon: <WarningIcon />,
+     },
+     tip: {
+       bgColor: 'bg-green-50 dark:bg-green-900/20',
+       borderColor: 'border-green-200 dark:border-green-500/30',
+       iconColor: 'text-green-600',
+       titleColor: 'text-green-800 dark:text-green-300',
+       icon: <TipIcon />,
+     },
+     info: {
+       bgColor: 'bg-purple-50 dark:bg-purple-900/20',
+       borderColor: 'border-purple-200 dark:border-purple-500/30',
+       iconColor: 'text-purple-500',
+       titleColor: 'text-purple-800 dark:text-purple-300',
+       icon: <InfoIcon />,
+     }
+   };
+
+   const currentStyle = styles[type] || styles.note;
+   const defaultTitle = title || type.charAt(0).toUpperCase() + type.slice(1);
+
+   return (
+     <div className={`not-prose mx-auto my-6 flex gap-4 rounded-lg border p-4 max-w-lg ${currentStyle.bgColor} ${currentStyle.borderColor}`}>
+       <div className={`mt-1 flex-shrink-0 ${currentStyle.iconColor}`}>
+         {currentStyle.icon}
+       </div>
+       <div className="flex-grow">
+         <h3 className={`text-lg font-semibold ${currentStyle.titleColor}`}>{defaultTitle}</h3>
+         <div className="prose prose-sm dark:prose-invert max-w-none text-gray-700 dark:text-gray-300 mt-2">
+           {children}
+         </div>
+       </div>
+     </div>
+   );
+ };
+
+ export default Admonition;
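A note on the fallback logic in `Admonition.jsx`: the style lookup and the heading are resolved independently, so an unrecognised `type` gets the `note` styling but keeps its own capitalised name as the heading. The sketch below restates that behaviour in plain JavaScript so it can be run outside of JSX. `resolveAdmonition`, `describe`, and `KNOWN_TYPES` are hypothetical names that do not appear in the commit, and the membership check is an approximation of the component's `styles[type] || styles.note` lookup.

```javascript
// Hypothetical helper, not part of the commit: mirrors how the Admonition
// component picks a style key and a heading from its `type` and `title` props.
const KNOWN_TYPES = ["note", "warning", "tip", "info"];

function resolveAdmonition(type = "note", title) {
  // Unknown types fall back to the "note" styling...
  const styleKey = KNOWN_TYPES.includes(type) ? type : "note";
  // ...but the heading is still derived from the raw type when no title is given.
  const heading = title || type.charAt(0).toUpperCase() + type.slice(1);
  return { styleKey, heading };
}

// Small formatter so the results print as plain strings.
function describe(type, title) {
  const { styleKey, heading } = resolveAdmonition(type, title);
  return `${styleKey} / ${heading}`;
}

console.log(describe("warning"));            // warning / Warning
console.log(describe("note", "Disclaimer")); // note / Disclaimer
console.log(describe("danger"));             // note / Danger
```

If the mismatch in the last case is unwanted, one option would be to derive the default heading from the resolved style key rather than from the raw `type`.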
+161
lume/src/blog/2025/dgx-spark-first-look.mdx
+ ---
+ title: "First look at the DGX Spark"
+ desc: "A local supercomputer between the size of a Mac mini and a Mac mini."
+ date: 2025-10-14
+ hero:
+   ai: "iPhone 17 Pro Max"
+   file: "dgx-spark"
+   prompt: "A golden NVIDIA DGX Spark device on a desk with a Steam Deck OLED for scale."
+   social: false
+ ---
+
+ import Admonition from "../../_components/Admonition.jsx";
+
+ <Admonition type="note" title="Disclaimer">
+   I'm treating this post as a sponsored post. I was not paid by NVIDIA to
+   work on this, but I did receive a DGX Spark from them pre-release and have
+   been diligently testing it and filing bugs.
+ </Admonition>
+
+ I've had access to the [NVIDIA DGX Spark](https://www.nvidia.com/en-us/products/workstations/dgx-spark/) for over a month now. Today I'm gonna cover my first impressions and let you know what I've been up to with it.
+
+ In a nutshell, this thing is a beast. It's one of the most powerful devices in my house and in a pinch I'd be okay with using it as my primary workstation. It pairs a CPU that's got enough punch to do software development with a GPU that's in that sweet spot between consumer and datacenter tier. Not to mention 128Gi of RAM. When I've been using this thing, the main limit is my imagination…and my poor understanding of Python environment management.
+
+ I think that it's best to understand the DGX Spark as a devkit for the [NVIDIA Grace Datacentre processors](https://www.nvidia.com/en-us/data-center/grace-cpu/). It's incredibly powerful for what it is: a device that can fit on your desk and run AI models right there.
+
+ <Picture
+   path="blog/2025/dgx-spark-first-look/dgx-spark-desktop"
+   desc="A DGX Spark on top of a desk with typical computer things around it such as a laptop, a coffee mug, a keyboard, and a mouse."
+ />
+
+ The DGX Spark is tiny. It's about as wide as the screen of a Steam Deck OLED, or about halfway between the size of a Mac mini M1 and a Mac mini M4.
+
+ This thing is also stupidly power efficient. I've been unable to cause my office to get warm in a way that can be attributed to the DGX Spark alone. On average, rendering Final Fantasy 14 in one of the major player hub areas makes my tower use more power than the DGX Spark does while doing AI finetuning. I'll talk more about this in the future.
+
+ One of the most interesting things about this device is that it's got an Arm chip, CUDA, and unified RAM. This combination means that in practice you need to compile all of the Python packages you're using from source. Pip usually handles this well enough, but it does mean that many Python packages will take longer to install than they do on an x86 system. I assume this will be ironed out as the ecosystem matures.
+
+ The power efficiency can't be overstated though. I've tried to make my office warm using the DGX Spark and I have failed. I'm seeing it pull a maximum of 70 watts.
+
+ I get about 30-40 tokens per second with [gpt-oss:120b](https://ollama.com/library/gpt-oss:120b):
+
+ ```
+ $ ollama version
+
+ $ ollama run \
+   --nowordwrap \
+   --verbose \
+   xe/mimi:gpt-oss-120b \
+   "Summarize this post: $(cat 2025/rolling-ladder-behind-us.mdx)"
+ ```
+
+ <ConvP>
+   <Conv name="Mimi" mood="think">
+     The user asks: "Summarize this post: --- ...". So we need to provide a summary. Follow developer instructions: be polite, bubbly, lowercase when possible unless serious. No JSON. Summarize the content. The post is long, covering themes: loss of craft, historical analogy with weavers and Luddites, decline of senior titles, AI replacing expertise, vibe coding, security concerns, AI's impact on art, etc. Provide concise summary. Use informal tone, lowercase. End without period unless serious. Probably okay.
+   </Conv>
+   <Conv name="Mimi" mood="happy">
+     this post is a sprawling meditation on how we keep “rolling the ladder up behind us” – building a future that erodes the very expertise that made it possible. it starts with a historic look at cloth‑making: skilled weavers were displaced by power looms, turning a durable, treasured craft into cheap, disposable fast‑fashion. the author draws a parallel to today’s software world, warning that we’re replacing senior developers and deep craft with hype‑driven AI tools and “vibe coding” platforms that promise instant results but degrade long‑term quality.
+
+     key points:
+
+     - **craft is disappearing** – both in weaving and coding, the knowledge of masters is vanishing, leaving only fragmented R&D notes or AI‑generated shortcuts.
+     - **senior titles are at risk** – companies favor hiring senior talent without nurturing the next generation, so the pool of true “seniors” will run out.
+     - **AI as a double‑edged sword** – generative tools can reduce drudgery but are being marketed as the next industrial revolution while actually shifting value to owners and creating insecure, low‑quality products.
+     - **vibe coding & AI assistants** – slick UX masks the fact that many tools are subscription traps, security hazards, and can erode programmers’ skills.
+     - **artistic impact** – similar to how AI floods art spaces with cheap, low‑effort outputs, software development risks becoming a flood of “good enough” code.
+     - **security concerns** – model‑context‑protocol servers can expose secrets and run unchecked code, highlighting the need for sandboxed, capability‑based designs.
+     - **broader societal worry** – the author (also the CEO of a small AI‑security startup) sees a winner‑take‑all capitalism fueled by AI, with the cost falling on workers, artists, and even the environment.
+
+     the piece ends with a bleak outlook: without deliberate effort to preserve and pass on true craft, future generations will inherit ladders that only go up for the privileged few, leaving the rest scrambling. it’s a call to value high‑quality, human‑made work over quick, AI‑generated shortcuts.
+
+   </Conv>
+ </ConvP>
+
+ ```
+ total duration:       16.464571893s
+ load duration:        123.742176ms
+ prompt eval count:    7237 token(s)
+ prompt eval duration: 33.491521ms
+ prompt eval rate:     216084.54 tokens/s
+ eval count:           567 token(s)
+ eval duration:        16.063168189s
+ eval rate:            35.30 tokens/s
+ ```
+
+ With flash attention on gpt-oss:120b at a 128k context window, it uses about 70Gi of RAM:
+
+ ```
+ $ ollama ps
+ NAME                    ID              SIZE     PROCESSOR    CONTEXT    UNTIL
+ xe/mimi:gpt-oss-120b    81089177a28c    70 GB    100% GPU     131072     29 minutes from now
+ ```
+
+ ```
+ xe@zohar:~$ nvidia-smi
+ Mon Oct 13 22:48:25 2025
+ +-----------------------------------------------------------------------------------------+
+ | NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+ +-----------------------------------------+------------------------+----------------------+
+ | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
+ | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
+ |                                         |                        |               MIG M. |
+ |=========================================+========================+======================|
+ |   0  NVIDIA GB10                    On  |   0000000F:01:00.0 Off |                  N/A |
+ | N/A   43C    P0             11W /  N/A  | Not Supported          |      0%      Default |
+ |                                         |                        |                  N/A |
+ +-----------------------------------------+------------------------+----------------------+
+
+ +-----------------------------------------------------------------------------------------+
+ | Processes:                                                                              |
+ |  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
+ |        ID   ID                                                               Usage      |
+ |=========================================================================================|
+ |    0   N/A  N/A            4752      C   /usr/local/bin/ollama                 66473MiB |
+ +-----------------------------------------------------------------------------------------+
+ ```
+
+ I assume that the unaccounted 4Gi or so of RAM is the CPU RAM overhead of the Ollama model runner process.
+
+ ## What I'm doing with the DGX Spark
+
+ So far I've been using the Spark in place of cloud GPUs for every AI thing I've needed to do at work. In general, I haven't really noticed any differences between the GPU in the cloud and the Spark on my home network. The only real rough edge is that I need to use this one blessed NVIDIA-authored Docker image to run IPython notebooks. It's easy enough though. Usually my Docker command looks like:
+
+ ```
+ docker run \
+   --gpus all \
+   --net=host \
+   --ipc=host \
+   --ulimit memlock=-1 \
+   --ulimit stack=67108864 \
+   -it \
+   --rm \
+   -v "$HOME/.cache/huggingface:/root/.cache/huggingface" \
+   -v "$HOME/.huggingface:/root/.huggingface" \
+   -v "$HOME/Code:/workspace/code" \
+   -v "$SSH_AUTH_SOCK:$SSH_AUTH_SOCK" \
+   -e HF_TOKEN=hf_hunter2hunter2hunter2 \
+   -e "SSH_AUTH_SOCK=$SSH_AUTH_SOCK" \
+   -e HF_HOME=/root/.cache/huggingface \
+   -e HF_HUB_CACHE=/root/.cache/huggingface/hub \
+   -e HF_DATASETS_CACHE=/root/.cache/huggingface/datasets \
+   nvcr.io/nvidia/pytorch:25.09-py3
+ ```
+
+ And then it Just Works™.
+
+ The main thing I've been doing with it is inference of [GPT-OSS 120b](https://ollama.com/library/gpt-oss:120b) via [Ollama](https://ollama.com/). I've been doing latency and power usage testing by setting up a Discord bot and telling people that the goal is to jailbreak the bot into telling you how to make a chocolate cake. Nobody has been able to make my room warm.
+
+ ## What's up next?
+
+ This whole experience has been a bit of a career bucket list item for me. I've never had access to prerelease hardware like this before, and being able to see what reviewers have to deal with before things are available to the masses is enlightening. I've ended up filing GPU driver bugs using my tower as a "known good" reference.
+
+ I've been slowly sinking my teeth into learning how AI training actually works using this device to do it. I've mostly been focusing on finetuning GPT-2 and using that to learn the important parts of dataset cleaning, tokenization, and more. Let me know if you want to hear more about that and if you want me to release my practice models.
+
+ At the very least though, here are the things I have in the pipeline that this device enables:
+
+ - Finetuning at home: how to make your own AI models do what you want
+ - Some rough outlines and/or overviews for how I want to use classical machine learning models to enhance Anubis and do outlier detection
+ - If I can somehow get Final Fantasy 14 running on it, some benchmarking in comparison to my gaming tower (if you know how to get amd64 games running well on aarch64, DM me!)
+
+ I also plan to make a comprehensive review video. Details to be announced soon.
+
+ I hope this was interesting. Thanks for early access to the device, NVIDIA!
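As a sanity check on the `--verbose` stats quoted in the post, each reported rate is a token count divided by the matching duration. The sketch below redoes that arithmetic with the numbers copied from the output; `tokensPerSecond` is a hypothetical helper name, and the only assumption is that the durations are converted to seconds first.

```javascript
// Hypothetical helper: tokens per second from a token count and a duration in seconds.
function tokensPerSecond(tokenCount, durationSeconds) {
  return tokenCount / durationSeconds;
}

// Numbers copied from the `ollama run --verbose` output in the post.
const evalRate = tokensPerSecond(567, 16.063168189);         // generation
const promptEvalRate = tokensPerSecond(7237, 33.491521e-3);  // prompt processing, ms converted to s

console.log(evalRate.toFixed(2)); // 35.30
console.log(Math.round(promptEvalRate));
```

The generation figure lands inside the 30-40 tokens per second range quoted in the post, and the prompt figure comes out close to the reported 216084.54 tokens/s.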