aesthetic.computer: Monorepo for Aesthetic.Computer
Revert "papers/arxiv-latency: factor out Parag — depersonalize for production"

This reverts commit d56228253b02e3cbc6f69e43db966d8c74b4fc48.

+10 -10
+4 -4
papers/arxiv-latency/latency-cards.tex
···
97 97 % ============================================================
98 98 \section{Introduction}
99 99
100 - An IRQ --- an \emph{interrupt request} --- is the hardware equivalent of someone tapping the CPU on the shoulder mid-sentence. Every modern CPU is, by default, ignoring almost everything. It runs whatever the kernel's scheduler last placed in front of it, and only stops when a wire is pulled. The keyboard controller pulls a wire when a key changes; the sound card pulls a wire when its DMA buffer has shifted enough samples out the speaker amplifier that it is about to run dry. The kernel responds with an \emph{interrupt service routine} (ISR), reads the device's status, and either consumes the event or wakes a userspace process that was waiting. That, in five sentences, is the entire mechanism by which a press of the spacebar eventually becomes a sound: a piece of physics (the key bottoming out closes a circuit) becomes a USB transaction, becomes an IRQ, becomes a wakeup, becomes a synth voice, becomes another DMA buffer, becomes a vibration in the air.
100 + Parag, an IRQ --- an \emph{interrupt request} --- is the hardware equivalent of someone tapping the CPU on the shoulder mid-sentence. Every modern CPU is, by default, ignoring almost everything. It runs whatever the kernel's scheduler last placed in front of it, and only stops when a wire is pulled. The keyboard controller pulls a wire when a key changes; the sound card pulls a wire when its DMA buffer has shifted enough samples out the speaker amplifier that it is about to run dry. The kernel responds with an \emph{interrupt service routine} (ISR), reads the device's status, and either consumes the event or wakes a userspace process that was waiting. That, in five sentences, is the entire mechanism by which a press of the spacebar eventually becomes a sound: a piece of physics (the key bottoming out closes a circuit) becomes a USB transaction, becomes an IRQ, becomes a wakeup, becomes a synth voice, becomes another DMA buffer, becomes a vibration in the air.
101 101
102 - The interesting question is: how long does that path take, and how much of the time is spent on things we cannot avoid versus things we have chosen to put in the way? This paper answers that question for \acos{}~\citep{scudder2026os}, the bare-metal Linux build that powers the chromatic-keyboard piece \texttt{notepat}~\citep{scudder2026notepat}. The system was designed from the start as a musical instrument: the relevant performance metric is not throughput but the latency of a single keypress reaching the DAC, and the jitter on that latency. Wessel and Wright argued in 2002 that any tool intended for intimate musical control should keep this number below 10\,ms~\citep{wessel2002problems}; McPherson and colleagues showed in 2016 that most general-purpose computing stacks are nowhere close~\citep{mcpherson2016action}, motivating the dedicated Bela platform~\citep{jack2018sub,bela}. \acos{} is not Bela. It runs on whatever surplus laptop you flash it onto. But its design intent is the same: minimize the layers between the key and the DAC, and put visible numbers next to each one that remains.
102 + The interesting question, the one you actually asked, is: how long does that path take, and how much of the time is spent on things we cannot avoid versus things we have chosen to put in the way? This paper answers that question for \acos{}~\citep{scudder2026os}, the bare-metal Linux build that powers the chromatic-keyboard piece \texttt{notepat}~\citep{scudder2026notepat}. The system was designed from the start as a musical instrument: the relevant performance metric is not throughput but the latency of a single keypress reaching the DAC, and the jitter on that latency. Wessel and Wright argued in 2002 that any tool intended for intimate musical control should keep this number below 10\,ms~\citep{wessel2002problems}; McPherson and colleagues showed in 2016 that most general-purpose computing stacks are nowhere close~\citep{mcpherson2016action}, motivating the dedicated Bela platform~\citep{jack2018sub,bela}. \acos{} is not Bela. It runs on whatever surplus laptop you flash it onto. But its design intent is the same: minimize the layers between the key and the DAC, and put visible numbers next to each one that remains.
103 103
104 104 The instrument did not start on bare metal. It started in a browser tab. \texttt{notepat} shipped first as a web piece, and the entire impetus to build a custom OS came from a specific measurement. On 2026-01-09 the \texttt{notepat} latency-testing harness was added (commit \texttt{001a0769a}); on 2026-01-29 a target was set --- \texttt{window.\_\_acLatencyHint = 0.003} (3\,ms, commit \texttt{22675462f}); on 2026-02-15 the harness measured a mean of \textbf{417.28\,ms} (commit \texttt{578154860}) against that target. That is the moment a custom OS became inevitable: the work to remove 14$\times$ of overhead from a Web Audio + AudioWorklet + browser-event-loop stack would exceed the work to write a Linux runtime that talks ALSA directly. The kiosk-OS scaffolding (FFOS, then FedAC, then ac-native) lands in the next two weeks; the ALSA audio engine itself appears on 2026-03-05 in commit \texttt{a94cc50ea}. By the time this paper is written, the same \texttt{notepat} that was at 417\,ms in a browser is at 3--4\,ms on a flashed surplus ThinkPad.
105 105
···
345 345
346 346 \section{Conclusion}
347 347
348 - An interrupt is a wire that the CPU has agreed to listen to. The keyboard pulls one when a key changes. The sound card pulls one when its DMA buffer is about to run dry. Between those two events the kernel does roughly 100\,$\mu$s of work and userspace does another 100\,$\mu$s --- those are not the latency. The latency is the buffers we keep in flight to absorb scheduling jitter, plus whatever firmware sits between the OS and the silicon, plus whatever userspace abstractions someone added before us. On a ThinkPad with HDA-direct audio, \acos{} is at $\sim$3--4\,ms key-to-DAC, against a $\sim$2\,ms physics floor. On an HP Chromebook G7 with SOF, the floor itself is $\sim$80\,ms. The same instrument was at 417\,ms in a browser earlier this year. The native OS exists because that number could not be moved by working inside the browser; it can be moved by leaving.
348 + So, Parag: an interrupt is a wire that the CPU has agreed to listen to. The keyboard pulls one when a key changes. The sound card pulls one when its DMA buffer is about to run dry. Between those two events the kernel does roughly 100\,$\mu$s of work and userspace does another 100\,$\mu$s --- those are not the latency. The latency is the buffers we keep in flight to absorb scheduling jitter, plus whatever firmware sits between the OS and the silicon, plus whatever userspace abstractions someone added before us. On a ThinkPad with HDA-direct audio, \acos{} is at $\sim$3--4\,ms key-to-DAC, against a $\sim$2\,ms physics floor. On an HP Chromebook G7 with SOF, the floor itself is $\sim$80\,ms. The same instrument was at 417\,ms in a browser earlier this year. The native OS exists because that number could not be moved by working inside the browser; it can be moved by leaving.
349 349
350 350 The right question for an instrument is never ``how low can the average go,'' it is ``how predictable is the worst case.'' That is what the recent work has been about. The audible thresholds (10\,ms for intimate musical control, 20\,ms before the action-sound bond starts to break) are met on the hardware we recommend. They are not yet met on every laptop in a thrift bin --- not because the kernel cannot deliver, but because the firmware sometimes will not let it.
351 351
352 352 \vspace{0.5em}
353 - \noindent\textbf{Acknowledgments.} The numbers in this paper are derived from the public commit history of the \texttt{aesthetic-computer} repository as of April 2026.
353 + \noindent\textbf{Acknowledgments.} For Parag, who asked the question that made this paper worth writing. The numbers in this paper are derived from the public commit history of the \texttt{aesthetic-computer} repository as of April 2026.
354 354
355 355 \vspace{0.5em}
356 356 \noindent\textbf{ORCID:} \href{https://orcid.org/0009-0007-4460-4913}{0009-0007-4460-4913}
+6 -6
papers/arxiv-latency/latency.tex
···
149 149 \vspace{0.2em}
150 150 {\aclight\fontsize{11pt}{13pt}\selectfont\color{acpink} Input and Audio Latency in AC Native OS}\par
151 151 \vspace{0.3em}
152 - {\aclight\fontsize{9pt}{11pt}\selectfont\color{acgray} On what an interrupt is and how close to physics we already are}\par
152 + {\aclight\fontsize{9pt}{11pt}\selectfont\color{acgray} A letter for Parag, on what an interrupt is and how close to physics we already are}\par
153 153 \vspace{0.6em}
154 154 {\normalsize\href{https://prompt.ac/@jeffrey}{@jeffrey}}\par
155 155 {\small\color{acgray} Aesthetic.Computer}\par
···
168 168
169 169 \begin{quote}
170 170 \small\noindent\textbf{Abstract.}
171 - This paper walks the keypress-to-sound path inside \acos{} from the keyboard's USB host controller IRQ down to the audio codec's DMA engine, asking what an IRQ actually is and whether stacking display servers makes a computer feel slower. I quantify each layer the signal must cross, compare the values measured in \acos{} today against the theoretical floor set by physics and minimum kernel work, and trace the commit-by-commit history of how the chromatic keyboard piece \texttt{notepat} arrived at its current numbers --- a story that begins not on bare metal but in the browser, where on 2026-02-15 a measured \textbf{417\,ms} keyboard-to-sound latency in the web build of \texttt{notepat} (against a 30\,ms target) made it clear that no amount of profiling inside Chrome would close the gap. \acos{} was the response. It now runs ALSA at a 192-frame period at 192\,kHz ($\approx$1\,ms hardware turnaround) on HDA-direct codecs, falling back to 10--20\,ms periods on Sound Open Firmware (SOF) platforms whose DAPM models cannot tolerate sub-period scheduling pressure. Wayland was tried (via the \texttt{cage} compositor on a NixOS prototype) and removed --- the bare-metal build runs DRM-direct on \texttt{evdev}, eliminating the entire \texttt{evdev}\,$\rightarrow$\,\texttt{libinput}\,$\rightarrow$\,\texttt{cage}\,$\rightarrow$\,\texttt{wl\_pointer} input chain because each abstraction layer either adds a context switch (\,$\mu$s, harmless) or a buffer turnaround (ms or one frame, audible). The realistic floor is approximately 2\,ms key-to-DAC; we are at roughly 3--4\,ms on HDA hardware and 12--22\,ms on SOF. The macOS sibling port makes the thesis even sharper: switching from SDL3's audio stream to a direct CoreAudio backend dropped the measured median from 6.47\,ms to 0.65\,ms on the same hardware --- a 10$\times$ reduction that no buffer-size change could reach, because the bottleneck was a layer we had not noticed adding. The remaining gap is not algorithmic --- it is the cost of supporting hardware whose firmware demands buffering we do not need.
171 + This paper, inspired by questions from Parag about what an IRQ is and whether stacking display servers makes a computer feel slower, walks the keypress-to-sound path inside \acos{} from the keyboard's USB host controller IRQ down to the audio codec's DMA engine. I quantify each layer the signal must cross, compare the values measured in \acos{} today against the theoretical floor set by physics and minimum kernel work, and trace the commit-by-commit history of how the chromatic keyboard piece \texttt{notepat} arrived at its current numbers --- a story that begins not on bare metal but in the browser, where on 2026-02-15 a measured \textbf{417\,ms} keyboard-to-sound latency in the web build of \texttt{notepat} (against a 30\,ms target) made it clear that no amount of profiling inside Chrome would close the gap. \acos{} was the response. It now runs ALSA at a 192-frame period at 192\,kHz ($\approx$1\,ms hardware turnaround) on HDA-direct codecs, falling back to 10--20\,ms periods on Sound Open Firmware (SOF) platforms whose DAPM models cannot tolerate sub-period scheduling pressure. Wayland was tried (via the \texttt{cage} compositor on a NixOS prototype) and removed --- the bare-metal build runs DRM-direct on \texttt{evdev}, eliminating the entire \texttt{evdev}\,$\rightarrow$\,\texttt{libinput}\,$\rightarrow$\,\texttt{cage}\,$\rightarrow$\,\texttt{wl\_pointer} input chain because each abstraction layer either adds a context switch (\,$\mu$s, harmless) or a buffer turnaround (ms or one frame, audible). The realistic floor is approximately 2\,ms key-to-DAC; we are at roughly 3--4\,ms on HDA hardware and 12--22\,ms on SOF. The macOS sibling port makes the thesis even sharper: switching from SDL3's audio stream to a direct CoreAudio backend dropped the measured median from 6.47\,ms to 0.65\,ms on the same hardware --- a 10$\times$ reduction that no buffer-size change could reach, because the bottleneck was a layer we had not noticed adding. The remaining gap is not algorithmic --- it is the cost of supporting hardware whose firmware demands buffering we do not need.
172 172 \end{quote}
173 173 \vspace{0.5em}
174 174 }]
···
177 177
178 178 \section{Introduction}
179 179
180 - An IRQ --- an \emph{interrupt request} --- is the hardware equivalent of someone tapping the CPU on the shoulder mid-sentence. Every modern CPU is, by default, ignoring almost everything. It runs whatever the kernel's scheduler last placed in front of it, and only stops when a wire is pulled. The keyboard controller pulls a wire when a key changes; the sound card pulls a wire when its DMA buffer has shifted enough samples out the speaker amplifier that it is about to run dry. The kernel responds with an \emph{interrupt service routine} (ISR), reads the device's status, and either consumes the event or wakes a userspace process that was waiting. That, in five sentences, is the entire mechanism by which a press of the spacebar eventually becomes a sound: a piece of physics (the key bottoming out closes a circuit) becomes a USB transaction, becomes an IRQ, becomes a wakeup, becomes a synth voice, becomes another DMA buffer, becomes a vibration in the air.
180 + Parag, an IRQ --- an \emph{interrupt request} --- is the hardware equivalent of someone tapping the CPU on the shoulder mid-sentence. Every modern CPU is, by default, ignoring almost everything. It runs whatever the kernel's scheduler last placed in front of it, and only stops when a wire is pulled. The keyboard controller pulls a wire when a key changes; the sound card pulls a wire when its DMA buffer has shifted enough samples out the speaker amplifier that it is about to run dry. The kernel responds with an \emph{interrupt service routine} (ISR), reads the device's status, and either consumes the event or wakes a userspace process that was waiting. That, in five sentences, is the entire mechanism by which a press of the spacebar eventually becomes a sound: a piece of physics (the key bottoming out closes a circuit) becomes a USB transaction, becomes an IRQ, becomes a wakeup, becomes a synth voice, becomes another DMA buffer, becomes a vibration in the air.
181 181
182 - The interesting question is: how long does that path take, and how much of the time is spent on things we cannot avoid versus things we have chosen to put in the way? This paper answers that question for \acos{}~\citep{scudder2026os}, the bare-metal Linux build that powers the chromatic-keyboard piece \texttt{notepat}~\citep{scudder2026notepat}. The system was designed from the start as a musical instrument: the relevant performance metric is not throughput but the latency of a single keypress reaching the DAC, and the jitter on that latency. Wessel and Wright argued in 2002 that any tool intended for intimate musical control should keep this number below 10\,ms~\citep{wessel2002problems}; McPherson and colleagues showed in 2016 that most general-purpose computing stacks are nowhere close~\citep{mcpherson2016action}, motivating the dedicated Bela platform~\citep{jack2018sub,bela}. \acos{} is not Bela. It runs on whatever surplus laptop you flash it onto. But its design intent is the same: minimize the layers between the key and the DAC, and put visible numbers next to each one that remains.
182 + The interesting question, the one you actually asked, is: how long does that path take, and how much of the time is spent on things we cannot avoid versus things we have chosen to put in the way? This paper answers that question for \acos{}~\citep{scudder2026os}, the bare-metal Linux build that powers the chromatic-keyboard piece \texttt{notepat}~\citep{scudder2026notepat}. The system was designed from the start as a musical instrument: the relevant performance metric is not throughput but the latency of a single keypress reaching the DAC, and the jitter on that latency. Wessel and Wright argued in 2002 that any tool intended for intimate musical control should keep this number below 10\,ms~\citep{wessel2002problems}; McPherson and colleagues showed in 2016 that most general-purpose computing stacks are nowhere close~\citep{mcpherson2016action}, motivating the dedicated Bela platform~\citep{jack2018sub,bela}. \acos{} is not Bela. It runs on whatever surplus laptop you flash it onto. But its design intent is the same: minimize the layers between the key and the DAC, and put visible numbers next to each one that remains.
183 183
184 184 The instrument did not start on bare metal. It started in a browser tab. \texttt{notepat} shipped first as a web piece, and the entire impetus to build a custom OS came from a specific measurement. On 2026-01-09 the \texttt{notepat} latency-testing harness was added (commit \texttt{001a0769a}); on 2026-01-29 a target was set --- \texttt{window.\_\_acLatencyHint = 0.003} (3\,ms, commit \texttt{22675462f}); on 2026-02-15 the harness measured a mean of \textbf{417.28\,ms} (commit \texttt{578154860}) against that target. That is the moment a custom OS became inevitable: the work to remove 14$\times$ of overhead from a Web Audio + AudioWorklet + browser-event-loop stack would exceed the work to write a Linux runtime that talks ALSA directly. The kiosk-OS scaffolding (FFOS, then FedAC, then ac-native) lands in the next two weeks; the ALSA audio engine itself appears on 2026-03-05 in commit \texttt{a94cc50ea}. By the time this paper is written, the same \texttt{notepat} that was at 417\,ms in a browser is at 3--4\,ms on a flashed surplus ThinkPad.
185 185
···
425 425
426 426 \section{Conclusion}
427 427
428 - An interrupt is a wire that the CPU has agreed to listen to. The keyboard pulls one when a key changes. The sound card pulls one when its DMA buffer is about to run dry. Between those two events the kernel does roughly 100\,$\mu$s of work and userspace does another 100\,$\mu$s --- those are not the latency. The latency is the buffers we keep in flight to absorb scheduling jitter, plus whatever firmware sits between the OS and the silicon, plus whatever userspace abstractions someone added before us. On a ThinkPad with HDA-direct audio, \acos{} is at $\sim$3--4\,ms key-to-DAC, against a $\sim$2\,ms physics floor. On an HP Chromebook G7 with SOF, the floor itself is $\sim$80\,ms. The same instrument was at 417\,ms in a browser earlier this year. The native OS exists because that number could not be moved by working inside the browser; it can be moved by leaving.
428 + So, Parag: an interrupt is a wire that the CPU has agreed to listen to. The keyboard pulls one when a key changes. The sound card pulls one when its DMA buffer is about to run dry. Between those two events the kernel does roughly 100\,$\mu$s of work and userspace does another 100\,$\mu$s --- those are not the latency. The latency is the buffers we keep in flight to absorb scheduling jitter, plus whatever firmware sits between the OS and the silicon, plus whatever userspace abstractions someone added before us. On a ThinkPad with HDA-direct audio, \acos{} is at $\sim$3--4\,ms key-to-DAC, against a $\sim$2\,ms physics floor. On an HP Chromebook G7 with SOF, the floor itself is $\sim$80\,ms. The same instrument was at 417\,ms in a browser earlier this year. The native OS exists because that number could not be moved by working inside the browser; it can be moved by leaving.
429 429
430 430 The right question for an instrument is never ``how low can the average go,'' it is ``how predictable is the worst case.'' That is what the recent work has been about. The audible thresholds (10\,ms for intimate musical control, 20\,ms before the action-sound bond starts to break) are met on the hardware we recommend. They are not yet met on every laptop in a thrift bin --- not because the kernel cannot deliver, but because the firmware sometimes will not let it.
431 431
432 432 \vspace{0.5em}
433 - \noindent\textbf{Acknowledgments.} The numbers in this paper are derived from the public commit history of the \texttt{aesthetic-computer} repository as of April 2026.
433 + \noindent\textbf{Acknowledgments.} For Parag, who asked the question that made this paper worth writing. The numbers in this paper are derived from the public commit history of the \texttt{aesthetic-computer} repository as of April 2026.
434 434
435 435 \vspace{0.5em}
436 436 \noindent\textbf{ORCID:} \href{https://orcid.org/0009-0007-4460-4913}{0009-0007-4460-4913}