papers: submissions ledger + SIGGRAPH Asia tech-papers scaffold

+4 -3

papers/SCORE.md

···
40 40
41 41 | Paper | Format | PDF | Source |
42 42 |-------|--------|-----|--------|
43 + | Where the Microseconds Go (SIGGRAPH Asia 2026 Tech Papers port) | acmtog (LaTeX, scaffold) | (build pending) | `siggraph-asia-2026-tech/latency-source.tex` |
43 44 | Diagrams from Data: A Penrose Pipeline for AC Illustrations | arXiv (LaTeX) | `arxiv-penrose/penrose.pdf` | `arxiv-penrose/penrose.tex` |
44 45 | Where the Microseconds Go: Input and Audio Latency in AC Native OS | arXiv (LaTeX, 6pp) | `arxiv-latency/latency.pdf` | `arxiv-latency/latency.tex` |
45 46 | Aesthetic Computer Demo (C&C 2026) | ACM Demo (LaTeX) | `cc-demo-2026/demo.pdf` | `cc-demo-2026/demo.tex` |
···
117 118
118 119 | Venue | Type | Deadline | Conference Date | Status |
119 120 |-------|------|----------|-----------------|--------|
120 - | [ACM C&C 2026](https://cc.acm.org/2026/demos/) | Demos | Apr 16, 2026 | Jul 13–16, London | DRAFT READY (`cc-demo-2026/`) — submission status unconfirmed |
121 - | [ICCC 2026](https://computationalcreativity.net/iccc26/short-papers/) | Short Papers | **Apr 24, 2026 (23:59 AoE)** | Jun 29–Jul 3, Coimbra | GO — 4pp anonymized, EasyChair |
121 + | [ACM C&C 2026](https://cc.acm.org/2026/demos/) | Demos | Apr 16, 2026 | Jul 13–16, London | **DEADLINE PASSED** — verify submission per `SUBMISSIONS.md` |
122 + | [ICCC 2026](https://computationalcreativity.net/iccc26/short-papers/) | Short Papers | Apr 24, 2026 (23:59 AoE) | Jun 29–Jul 3, Coimbra | **DEADLINE PASSED** — verify submission per `SUBMISSIONS.md` |
122 123 | [ICCC 2026](https://computationalcreativity.net/iccc26/) | Early Career Symposium | May 15, 2026 | Jun 29–Jul 3, Coimbra | Candidate |
123 - | [SIGGRAPH Asia 2026](https://asia.siggraph.org/2026/submissions/) | Technical Papers (full) | May 12, 2026 | Dec 1–4, Kuala Lumpur | NEW |
124 + | [SIGGRAPH Asia 2026](https://asia.siggraph.org/2026/submissions/technical-papers/) | Technical Papers (full) | **May 5 form / May 12 paper / May 13 upload** | Dec 1–4, Kuala Lumpur | SCAFFOLD ready (`siggraph-asia-2026-tech/`) — adapting `arxiv-latency` to acmtog double-blind |
124 125 | [SIGGRAPH Asia 2026](https://asia.siggraph.org/2026/submissions/) | Art Papers | Jun 8, 2026 | Dec 1–4, Kuala Lumpur | NEW |
125 126 | [SIGGRAPH Asia 2026](https://asia.siggraph.org/2026/submissions/) | Art Gallery / Emerging Tech / XR | Jun 18, 2026 | Dec 1–4, Kuala Lumpur | NEW |
126 127 | [SIGGRAPH Asia 2026](https://asia.siggraph.org/2026/submissions/) | Posters | Jul 31, 2026 | Dec 1–4, Kuala Lumpur | NEW |

+54

papers/SUBMISSIONS.md

···
1 + # Submissions Ledger
2 +
3 + A running record of what has actually been sent where. SCORE.md tracks
4 + draft state; this file tracks submission state. Update both when status
5 + changes.
6 +
7 + ## Status legend
8 +
9 + - **DRAFT** — text exists, not submitted
10 + - **READY** — formatted for the venue, not submitted
11 + - **SUBMITTED** — uploaded, awaiting decision (record date + confirmation #)
12 + - **ACCEPTED** / **REJECTED** / **WITHDRAWN** — terminal states
13 + - **MISSED** — deadline passed without submission; track for next year
14 +
15 + ## 2026
16 +
17 + ### Needs immediate verification — deadline already passed
18 +
19 + These two had drafts ready before the deadline but submission was never
20 + confirmed. Check both before deciding what's next.
21 +
22 + #### ACM C&C 2026 Demos — deadline 2026-04-16
23 +
24 + - [ ] Log into the ACM C&C 2026 demos submission portal (PCS or
25 + whatever the track uses) and confirm whether `cc-demo-2026/demo.tex`
26 + was submitted.
27 + - [ ] If submitted: record the submission ID + date here.
28 + - [ ] If not submitted: mark **MISSED** and move target to 2027 row in
29 + SCORE.md.
30 + - [ ] Either way: update the SCORE.md status column from
31 + "DRAFT READY ... submission status unconfirmed" to a definite state.
32 +
33 + #### ICCC 2026 Short Papers — deadline 2026-04-24 (23:59 AoE)
34 +
35 + - [ ] Log into EasyChair and check the ICCC 2026 short-paper track for
36 + a submission of `iccc-kidlisp/iccc.pdf`.
37 + - [ ] If submitted: record the EasyChair paper ID + submission date.
38 + - [ ] If not submitted: mark **MISSED**. ICCC also has an Early Career
39 + Symposium track with a 2026-05-15 deadline that may be a viable
40 + pivot — see SCORE.md.
41 + - [ ] Update SCORE.md status column.
42 +
43 + ### Open windows
44 +
45 + | Venue | Track | Deadline | State | Directory |
46 + |---|---|---|---|---|
47 + | SIGGRAPH Asia 2026 | Tech Papers (full) | 2026-05-05 form / 2026-05-12 paper / 2026-05-13 upload | SCAFFOLD | `siggraph-asia-2026-tech/` |
48 + | ICCC 2026 | Early Career Symposium | 2026-05-15 | CANDIDATE | (none yet) |
49 + | ArtsIT 2026 | Full Papers | 2026-06-01 | NEW | (none yet) |
50 + | SIGGRAPH Asia 2026 | Art Papers | 2026-06-08 | NEW | (none yet) |
51 + | SIGGRAPH Asia 2026 | Art Gallery / ET / XR | 2026-06-18 | NEW | (none yet) |
52 + | SIGGRAPH Asia 2026 | Posters | 2026-07-31 | NEW | (none yet) |
53 + | SIGGRAPH Asia 2026 | Real-Time Live! | 2026-08-07 | NEW | (none yet) |
54 + | JOSS | Software paper | rolling | DRAFTS | `joss-ac/`, `joss-kidlisp/` |

+92

papers/siggraph-asia-2026-tech/CONVERSION-NOTES.md

···
1 + # SIGGRAPH Asia 2026 Technical Papers — Conversion Notes
2 +
3 + Target: **Technical Papers (full)** track, dual-track conference paper.
4 +
5 + ## Deadlines (AoE assumed unless stated otherwise — verify)
6 +
7 + - **Tue 2026-05-05** — Submission form / abstract deadline
8 + - **Tue 2026-05-12** — Full paper deadline
9 + - **Wed 2026-05-13** — Upload deadline
10 +
11 + Submit at https://ssl.linklings.net/conferences/siggraphasia/
12 +
13 + ## Source
14 +
15 + `latency-source.tex` is a verbatim copy of `papers/arxiv-latency/latency.tex`
16 + (2026-04 working draft, "Where the Microseconds Go: Input and Audio Latency
17 + in AC Native OS"). The original is 6pp in custom AC two-column LaTeX.
18 +
19 + ## What needs to change for SIGGRAPH Asia
20 +
21 + ### 1. Template
22 + - Switch document class to `\documentclass[acmtog,anonymous,review]{acmart}`
23 + - Drop the custom `ac-paper-layout.sty`, AC fonts (`ywft-processing-*`),
24 + AC color palette, `draftwatermark`, and the custom `\titleformat`
25 + blocks — `acmart` provides all of this.
26 + - Keep `listings`, `booktabs`, `tabularx`, `natbib` (acmart includes a
27 + compatible bibliography stack — verify before duplicating).
28 + - Move from Latin Modern + AC display fonts to acmart's default
29 + Linux Libertine.
30 +
31 + ### 2. Page budget
32 + - Limit: **7 pages excluding references and figures-only pages**;
33 + at most **2 figures-only pages**.
34 + - Source is currently ~6pp arxiv two-column. acmart double-column at
35 + acmtog spacing is denser than the AC layout — expect ~5–6pp once
36 + reflowed. There is room to add a related-work section and one more
37 + figure without going over.
38 +
39 + ### 3. Anonymization (double-blind, mandatory)
40 + The source is NOT anonymous. Strip / rewrite:
41 + - Title block: `@jeffrey`, ORCID, `https://aesthetic.computer`,
42 + `pals.pdf` logo
43 + - Abstract: replace "I" with passive / "we"; remove "A letter for Parag"
44 + subtitle (deanonymizing dedication)
45 + - Body: remove or genericize `\acos{}`, `\ac{}`, "Aesthetic Computer",
46 + `notepat`, `fedac/native`, GitHub URLs, commit hashes that resolve
47 + to a public repo. Refer to "the runtime" / "the target OS" / "the
48 + test piece" instead.
49 + - Acknowledgements: omit at submission, restore for camera-ready.
50 + - Self-citations: cite as third party ("Scudder 2026 reports ...")
51 + and check that `\cite{}` keys do not leak handles.
52 +
53 + ### 4. Reframing for the graphics community
54 + The SIGGRAPH Asia Technical Papers track expects a graphics or
55 + interactive-systems contribution. The latency paper leans toward HCI
56 + systems. Add or strengthen:
57 + - A subsection on **frame-pacing and display latency** (Wayland section
58 + already gestures at this — expand).
59 + - An evaluation that ties input/audio latency to **interactive
60 + graphics** outcomes (responsiveness in a real-time visual feedback
61 + loop, not just sound).
62 + - Related work: cite at minimum Jota et al. on touch latency,
63 + Ng et al. "Designing for Low-Latency Direct-Touch Input,"
64 + and Casiez et al. on input-to-display latency.
65 +
66 + ### 5. Supplementary materials
67 + - Up to 500 MB. Allowed: code, data, video (5 min max), comparisons.
68 + - Suggested: a short video of `notepat` running on HDA-direct vs SOF
69 + hardware showing the audible difference; the latency-test harness
70 + output; the relevant `fedac/native` source files.
71 +
72 + ## Build
73 +
74 + Once converted, build with the standard SIGGRAPH route:
75 +
76 + ```bash
77 + cd papers/siggraph-asia-2026-tech
78 + xelatex paper.tex && bibtex paper && xelatex paper.tex && xelatex paper.tex
79 + ```
80 +
81 + The oven papermill polls `papers/`, so once `paper.tex` exists and
82 + `SCORE.md` lists it, the auto-build pipeline will produce `paper.pdf`.
83 +
84 + ## Decision point
85 +
86 + A 7pp double-blind acmart paper from a 6pp arxiv draft is roughly
87 + **3–5 days of focused writing**. With the May 5 form / May 12 paper
88 + deadlines, the realistic call is:
89 +
90 + - Decide by **Mon 2026-05-04** whether to submit.
91 + - If yes: dedicate May 5–11 to the rewrite.
92 + - If no: archive this directory or move it to SIGGRAPH 2027 prep.

+443

papers/siggraph-asia-2026-tech/latency-source.tex

···
1 + % !TEX program = xelatex
2 + \documentclass[10pt,letterpaper,twocolumn]{article}
3 +
4 + % === GEOMETRY ===
5 + \usepackage[top=0.75in, bottom=0.75in, left=0.75in, right=0.75in]{geometry}
6 +
7 + % === FONTS ===
8 + \usepackage{fontspec}
9 + \usepackage{unicode-math}
10 +
11 + \setmainfont{Latin Modern Roman}
12 + \setsansfont{Latin Modern Sans}
13 +
14 + \newfontfamily\acbold{ywft-processing-bold}[
15 + Path=../../system/public/type/webfonts/,
16 + Extension=.ttf
17 + ]
18 + \newfontfamily\aclight{ywft-processing-light}[
19 + Path=../../system/public/type/webfonts/,
20 + Extension=.ttf
21 + ]
22 + \setmonofont{Latin Modern Mono}[Scale=0.85]
23 +
24 + % === PACKAGES ===
25 + \usepackage{xcolor}
26 + \usepackage{titlesec}
27 + \usepackage{enumitem}
28 + \usepackage{booktabs}
29 + \usepackage{tabularx}
30 + \usepackage{multicol}
31 + \usepackage{fancyhdr}
32 + \usepackage{hyperref}
33 + \usepackage{graphicx}
34 + \usepackage{ragged2e}
35 + \usepackage{microtype}
36 + \usepackage{listings}
37 + \usepackage{natbib}
38 + \usepackage[colorspec=0.92]{draftwatermark}
39 +
40 + % === COLORS (AC palette) ===
41 + \definecolor{acpink}{RGB}{180,72,135}
42 + \definecolor{acpurple}{RGB}{120,80,180}
43 + \definecolor{acdark}{RGB}{64,56,74}
44 + \definecolor{acgray}{RGB}{119,119,119}
45 + \definecolor{draftcolor}{RGB}{180,72,135}
46 +
47 + % === DRAFT WATERMARK ===
48 + \DraftwatermarkOptions{
49 + text=WORKING DRAFT,
50 + fontsize=3cm,
51 + color=draftcolor!18,
52 + angle=45,
53 + pos={0.5\paperwidth, 0.5\paperheight}
54 + }
55 +
56 + % === C/JS SYNTAX COLORS ===
57 + \definecolor{jskw}{RGB}{119,51,170}
58 + \definecolor{jsfn}{RGB}{0,136,170}
59 + \definecolor{jsstr}{RGB}{170,120,0}
60 + \definecolor{jsnum}{RGB}{204,0,102}
61 + \definecolor{jscmt}{RGB}{102,102,102}
62 +
63 + % === HYPERREF ===
64 + \hypersetup{
65 + colorlinks=true,
66 + linkcolor=acpurple,
67 + urlcolor=acpurple,
68 + citecolor=acpurple,
69 + pdfauthor={@jeffrey},
70 + pdftitle={Where the Microseconds Go: Input and Audio Latency in AC Native OS},
71 + }
72 +
73 + % === SECTION FORMATTING ===
74 + \titleformat{\section}
75 + {\normalfont\bfseries\normalsize\uppercase}
76 + {\thesection.}
77 + {0.5em}
78 + {}
79 + \titlespacing{\section}{0pt}{1.2em}{0.3em}
80 +
81 + \titleformat{\subsection}
82 + {\normalfont\bfseries\small}
83 + {\thesubsection}
84 + {0.5em}
85 + {}
86 + \titlespacing{\subsection}{0pt}{0.8em}{0.2em}
87 +
88 + % === HEADER/FOOTER ===
89 + \pagestyle{fancy}
90 + \fancyhf{}
91 + \renewcommand{\headrulewidth}{0pt}
92 + \fancyhead[C]{\footnotesize\color{acpink}\textit{Working Draft --- not for citation}}
93 + \fancyfoot[C]{\footnotesize\thepage}
94 +
95 + % === CUSTOM COMMANDS ===
96 + \newcommand{\acdot}{{\color{acpink}.}}
97 + \newcommand{\ac}{\textsc{Aesthetic.Computer}}
98 + \newcommand{\acos}{\textsc{AC Native OS}}
99 +
100 + % === LISTINGS ===
101 + \lstdefinelanguage{acc}{
102 + morekeywords=[1]{const,static,struct,int,unsigned,void,if,else,return,while,for,sizeof,#define,#include},
103 + morekeywords=[2]{snd_pcm_writei,snd_pcm_hw_params_set_period_size_near,read,poll,epoll_wait,clock_gettime,wl_keyboard,wl_display},
104 + sensitive=true,
105 + morecomment=[l]{//},
106 + morestring=[b]",
107 + }
108 +
109 + \lstdefinestyle{accstyle}{
110 + language=acc,
111 + keywordstyle=[1]\color{jskw}\bfseries,
112 + keywordstyle=[2]\color{jsfn}\bfseries,
113 + commentstyle=\color{jscmt}\itshape,
114 + stringstyle=\color{jsstr},
115 + }
116 +
117 + \lstset{
118 + basicstyle=\ttfamily\small,
119 + breaklines=true,
120 + frame=single,
121 + rulecolor=\color{acgray!30},
122 + backgroundcolor=\color{acgray!5},
123 + xleftmargin=0.5em,
124 + xrightmargin=0.5em,
125 + aboveskip=0.5em,
126 + belowskip=0.5em,
127 + }
128 +
129 + % === LIST SETTINGS ===
130 + \setlist[itemize]{nosep, leftmargin=1.2em, itemsep=0.1em}
131 + \setlist[enumerate]{nosep, leftmargin=1.2em}
132 +
133 + \setlength{\columnsep}{1.8em}
134 + \setlength{\parindent}{1em}
135 + \setlength{\parskip}{0.3em}
136 +
137 + \tolerance=800
138 + \emergencystretch=1em
139 + \hyphenpenalty=50
140 +
141 + \begin{document}
142 +
143 + % ============ TITLE BLOCK ============
144 +
145 + \twocolumn[{%
146 + \begin{center}
147 + \includegraphics[height=4em]{pals}\par\vspace{0.5em}
148 + {\acbold\fontsize{22pt}{26pt}\selectfont\color{acdark} Where the Microseconds Go}\par
149 + \vspace{0.2em}
150 + {\aclight\fontsize{11pt}{13pt}\selectfont\color{acpink} Input and Audio Latency in AC Native OS}\par
151 + \vspace{0.3em}
152 + {\aclight\fontsize{9pt}{11pt}\selectfont\color{acgray} A letter for Parag, on what an interrupt is and how close to physics we already are}\par
153 + \vspace{0.6em}
154 + {\normalsize\href{https://prompt.ac/@jeffrey}{@jeffrey}}\par
155 + {\small\color{acgray} Aesthetic.Computer}\par
156 + {\small\color{acgray} ORCID: \href{https://orcid.org/0009-0007-4460-4913}{0009-0007-4460-4913}}\par
157 + \vspace{0.3em}
158 + {\small\color{acpurple} \url{https://aesthetic.computer}}\par
159 + \vspace{0.6em}
160 + \rule{\textwidth}{1.5pt}
161 + \vspace{0.5em}
162 + \end{center}
163 +
164 + \begin{center}
165 + {\small\color{acpink}\textbf{[ working draft --- not for citation ]}}
166 + \end{center}
167 + \vspace{0.3em}
168 +
169 + \begin{quote}
170 + \small\noindent\textbf{Abstract.}
171 + This paper, inspired by questions from Parag about what an IRQ is and whether stacking display servers makes a computer feel slower, walks the keypress-to-sound path inside \acos{} from the keyboard's USB host controller IRQ down to the audio codec's DMA engine. I quantify each layer the signal must cross, compare the values measured in \acos{} today against the theoretical floor set by physics and minimum kernel work, and trace the commit-by-commit history of how the chromatic keyboard piece \texttt{notepat} arrived at its current numbers --- a story that begins not on bare metal but in the browser, where on 2026-02-15 a measured \textbf{417\,ms} keyboard-to-sound latency in the web build of \texttt{notepat} (against a 30\,ms target) made it clear that no amount of profiling inside Chrome would close the gap. \acos{} was the response. It now runs ALSA at a 192-frame period at 192\,kHz ($\approx$1\,ms hardware turnaround) on HDA-direct codecs, falling back to 10--20\,ms periods on Sound Open Firmware (SOF) platforms whose DAPM models cannot tolerate sub-period scheduling pressure. Wayland was tried (via the \texttt{cage} compositor on a NixOS prototype) and removed --- the bare-metal build runs DRM-direct on \texttt{evdev}, eliminating the entire \texttt{evdev}\,$\rightarrow$\,\texttt{libinput}\,$\rightarrow$\,\texttt{cage}\,$\rightarrow$\,\texttt{wl\_pointer} input chain because each abstraction layer either adds a context switch (\,$\mu$s, harmless) or a buffer turnaround (ms or one frame, audible). The realistic floor is approximately 2\,ms key-to-DAC; we are at roughly 3--4\,ms on HDA hardware and 12--22\,ms on SOF. The macOS sibling port makes the thesis even sharper: switching from SDL3's audio stream to a direct CoreAudio backend dropped the measured median from 6.47\,ms to 0.65\,ms on the same hardware --- a 10$\times$ reduction that no buffer-size change could reach, because the bottleneck was a layer we had not noticed adding. The remaining gap is not algorithmic --- it is the cost of supporting hardware whose firmware demands buffering we do not need.
172 + \end{quote}
173 + \vspace{0.5em}
174 + }]
175 +
176 + % ============ 1. INTRODUCTION ============
177 +
178 + \section{Introduction}
179 +
180 + Parag, an IRQ --- an \emph{interrupt request} --- is the hardware equivalent of someone tapping the CPU on the shoulder mid-sentence. Every modern CPU is, by default, ignoring almost everything. It runs whatever the kernel's scheduler last placed in front of it, and only stops when a wire is pulled. The keyboard controller pulls a wire when a key changes; the sound card pulls a wire when its DMA buffer has shifted enough samples out the speaker amplifier that it is about to run dry. The kernel responds with an \emph{interrupt service routine} (ISR), reads the device's status, and either consumes the event or wakes a userspace process that was waiting. That, in five sentences, is the entire mechanism by which a press of the spacebar eventually becomes a sound: a piece of physics (the key bottoming out closes a circuit) becomes a USB transaction, becomes an IRQ, becomes a wakeup, becomes a synth voice, becomes another DMA buffer, becomes a vibration in the air.
181 +
182 + The interesting question, the one you actually asked, is: how long does that path take, and how much of the time is spent on things we cannot avoid versus things we have chosen to put in the way? This paper answers that question for \acos{}~\citep{scudder2026os}, the bare-metal Linux build that powers the chromatic-keyboard piece \texttt{notepat}~\citep{scudder2026notepat}. The system was designed from the start as a musical instrument: the relevant performance metric is not throughput but the latency of a single keypress reaching the DAC, and the jitter on that latency. Wessel and Wright argued in 2002 that any tool intended for intimate musical control should keep this number below 10\,ms~\citep{wessel2002problems}; McPherson and colleagues showed in 2016 that most general-purpose computing stacks are nowhere close~\citep{mcpherson2016action}, motivating the dedicated Bela platform~\citep{jack2018sub,bela}. \acos{} is not Bela. It runs on whatever surplus laptop you flash it onto. But its design intent is the same: minimize the layers between the key and the DAC, and put visible numbers next to each one that remains.
183 +
184 + The instrument did not start on bare metal. It started in a browser tab. \texttt{notepat} shipped first as a web piece, and the entire impetus to build a custom OS came from a specific measurement. On 2026-01-09 the \texttt{notepat} latency-testing harness was added (commit \texttt{001a0769a}); on 2026-01-29 an audio latency target was written into the piece --- \texttt{window.\_\_acLatencyHint = 0.003} (3\,ms, commit \texttt{22675462f}); on 2026-02-15 the harness measured a mean of \textbf{417.28\,ms} (commit \texttt{578154860}) against the harness's own 30\,ms target. That is the moment a custom OS became inevitable: the work to remove 14$\times$ of overhead from a Web Audio + AudioWorklet + browser-event-loop stack would exceed the work to write a Linux runtime that talks ALSA directly. The kiosk-OS scaffolding (FFOS, then FedAC, then ac-native) lands over the surrounding weeks; the ALSA audio engine itself appears on 2026-03-05 in commit \texttt{a94cc50ea}. At the time of writing, the same \texttt{notepat} that was at 417\,ms in a browser is at 3--4\,ms on a flashed surplus ThinkPad.
185 +
186 + I describe the keypress path~(\S\ref{sec:keypath}), the audio path~(\S\ref{sec:audiopath}), the current measured numbers against the theoretical floor~(\S\ref{sec:floor}), the role of Wayland and the direct-KMS alternative~(\S\ref{sec:wayland}), the commit history of \texttt{notepat} latency improvements~(\S\ref{sec:history}), and what is left to squeeze~(\S\ref{sec:future}).
187 +
188 + % ============ 2. THE KEYPRESS PATH ============
189 +
190 + \section{The Keypress Path}
191 + \label{sec:keypath}
192 +
193 + In \texttt{fedac/native/src/input.c}, the chain from key to userspace runs as follows:
194 +
195 + \begin{enumerate}
196 + \item The user closes a key contact. Mechanical latency (debounce, scan-matrix sweep) is set by the keyboard's own microcontroller --- typically 1--5\,ms.
197 + \item The keyboard schedules a USB HID input report. The polling interval is 8\,ms on legacy low-speed devices, 1\,ms on full-speed devices, and 125\,$\mu$s on high-speed gaming keyboards (the NuPhy analog HE used by the development unit polls at 1\,kHz, i.e.\ every 1\,ms).
198 + \item The xHCI host controller raises an IRQ when the URB completes. The Linux ISR runs in tens of microseconds.
199 + \item The USB HID driver decodes the report and posts an \texttt{input\_event} via the \texttt{evdev} subsystem~\citep{evdev}, which makes it readable on \texttt{/dev/input/event*}.
200 + \item \acos{}'s main loop, blocking in \texttt{poll()} on those file descriptors, wakes within a further few tens of microseconds. The event is dispatched to the active piece's \texttt{act()} function via the QuickJS bridge~\citep{quickjs}.
201 + \item The piece updates state. If the event triggers a note, it calls \texttt{system.sound.play()}, which posts a voice descriptor to a lock-protected ring shared with the audio thread.
202 + \end{enumerate}
203 +
204 + The hardware floor on this path is the USB poll interval (1\,ms on a typical keyboard, 125\,$\mu$s on a gaming keyboard). Everything else is microseconds of kernel work plus a single context switch into the ac-native main loop. The runtime carries an extra subtlety: the NuPhy analog keyboard reports both as a generic HID keyboard and as a vendor-specific hidraw device that streams continuous pressure values. \acos{} reads the hidraw stream for analog pressure but suppresses the duplicate evdev events, since processing both would double-trigger every note (\texttt{input.c:478}).
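Steps 4 and 5 of the keypress path can be sketched in a few lines of C. This is a minimal illustration, not the code in `input.c`: the function name `wait_key_event` and the single-descriptor shape are assumptions made for the example, and the real main loop polls several descriptors at once.

```c
#include <errno.h>
#include <linux/input.h>   /* struct input_event, EV_KEY */
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper (not from input.c): block until one input_event
 * is readable on fd, or until timeout_ms elapses.
 * Returns 1 if *out holds an EV_KEY event, 0 on timeout or a non-key
 * event, -1 on error. */
int wait_key_event(int fd, int timeout_ms, struct input_event *out)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    /* The process sleeps here until the kernel marks fd readable. */
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return errno == EINTR ? 0 : -1;   /* signal: report "no event" */
    if (ready == 0 || !(pfd.revents & POLLIN))
        return 0;                         /* timeout, or no data to read */

    /* evdev hands out whole input_event records, one per read here. */
    ssize_t n = read(fd, out, sizeof *out);
    if (n != (ssize_t)sizeof *out)
        return -1;

    return out->type == EV_KEY ? 1 : 0;   /* ignore EV_SYN, EV_MSC, ... */
}
```

In a real loop `fd` would be an opened `/dev/input/event*` node and the timeout would usually be `-1`, so the thread costs nothing until the ISR posts an event.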
205 +
206 + When a Wayland compositor is in the loop, step 5 changes: the compositor reads evdev via \texttt{libinput}~\citep{libinput}, decides which client has focus, and serializes the event into a \texttt{wl\_keyboard.key} message on the per-client Wayland socket~\citep{wayland}. The client wakes, dispatches it through \texttt{wl\_display\_dispatch}, and only then does the piece see it. Each hop is one socket write plus one context switch --- typically under 200\,$\mu$s on a quiet system, but the worst case is bounded by the compositor's own scheduling, not by the input stack. Comments in \texttt{input.c:1135} note that when no compositor advertises a seat, ac-native falls back to reading evdev directly --- the lower-latency path.
207 +
208 + % ============ 3. THE AUDIO PATH ============
209 +
210 + \section{The Audio Path}
211 + \label{sec:audiopath}
212 +
213 + Audio in \acos{} is ALSA~\citep{alsa} all the way down. The configuration in \texttt{fedac/native/src/audio.h:11} declares:
214 +
215 + \begin{lstlisting}[style=accstyle]
216 + #define AUDIO_SAMPLE_RATE 192000
217 + #define AUDIO_PERIOD_SIZE 192 // ~1ms at 192kHz
218 + \end{lstlisting}
219 +
220 + The audio thread runs a tight loop: render \texttt{period\_frames} samples of mixed synth output into a buffer, then call \texttt{snd\_pcm\_writei()} to hand them to the codec. The codec's DMA engine consumes those samples on its own clock; when it has shifted out one full period, it raises an IRQ and the kernel wakes \texttt{snd\_pcm\_writei} so the next period can be written. The rendering and writing are double-buffered, so end-to-end latency is approximately \emph{period $\times$ number-of-periods-in-buffer}.
221 +
222 + \acos{} configures four periods of buffer (\texttt{audio.c:2275}). At the design rate this is 4\,ms of audio in flight at any moment, plus the analog delay of the codec and amplifier (typically $<$1\,ms).
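The period arithmetic above is easy to check by hand: 192 frames per period times four periods is 768 frames, and 768 frames at 192,000 frames per second is 4 ms. A minimal sketch of that calculation follows; the function name `buffer_latency_ms` is hypothetical and does not appear in `audio.c`.

```c
/* Illustrative arithmetic only; the function name is hypothetical.
 * Milliseconds of audio held in an ALSA ring buffer made of `periods`
 * periods, each `period_frames` frames long, played at `rate_hz`
 * frames per second. */
double buffer_latency_ms(unsigned period_frames, unsigned periods,
                         unsigned rate_hz)
{
    /* Multiply before dividing so the common cases stay exact. */
    return (double)period_frames * periods * 1000.0 / rate_hz;
}
```

Calling it with `(192, 1, 192000)` gives the single-period turnaround of about 1 ms, and `(192, 4, 192000)` gives the 4 ms of audio in flight quoted above. The analog codec and amplifier delay is not part of this figure.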
223 +
224 + \subsection{Why two configurations}
225 +
226 + The codec landscape on x86\_64 laptops splits cleanly. Older HDA-direct codecs (e.g., the Realtek ALC257 on a ThinkPad X13) accept 192-frame periods without complaint. ChromeOS-era Intel platforms --- specifically Jasper Lake-class devices like the HP Chromebook G7 used as the SOF reference unit --- route audio through Sound Open Firmware (SOF)~\citep{sof}, a DSP coprocessor that runs DAPM (Dynamic Audio Power Management) state machines. The G7's audio path is the \texttt{sof-rt5682} machine driver wired to a Realtek RT5682 codec for headphones plus a MAX98360A I$^{2}$S speaker amplifier. Commit \texttt{3e3608733} (\emph{native: SOF-aware audio period sizing}) records the discovery: with a 1\,ms period and 4\,ms buffer the SOF firmware logged \emph{10{,}686 sdmode toggles per boot} --- once per period, matching the $\sim$5\,ms toggle spacing exactly --- a DAPM amp-storm that desynchronized the speaker and produced silence even though the ALSA layer looked fine. The fix was to detect SOF (presence of \texttt{sound/soc/sof} card and absence of legacy HDA codec97) and back off the period to 10\,ms (480 frames at 48\,kHz) with a 40\,ms buffer.
227 +
228 + A second G7-specific bug surfaced the next day: \texttt{sof-rt5682} has no upstream ALSA UCM (Use Case Manager) definitions, so the DAPM graph stayed in the topology's init state and the Speakers path was disabled regardless of mixer state. Commit \texttt{cad9313e4} (\emph{native: bundle ChromeOS UCM + activate Speaker verb}) bundles the downstream \texttt{WeirdTreeThing/alsa-ucm-conf-cros} tree into the initramfs at image-build time and activates the Speaker verb after card open. ``On HDA-direct machines, \texttt{snd\_use\_case\_mgr\_open} returns -ENOENT and we fall through to the existing manual mixer unmute path unchanged.''
229 +
230 + Even the 10\,ms period turned out to be too tight: commit \texttt{ec143aca7} (\emph{native: bigger SOF buffer 20ms/80ms}) records continuing XRUN messages with short writes of 96 frames out of 480, indicating the buffer was draining faster than the userspace mixer could refill it under boot-time load. The current production setting on SOF hardware is a 20\,ms period in an 80\,ms buffer.
231 +
232 + This is the central asymmetry of audio latency on modern laptops: \emph{the kernel can do better, but the firmware cannot}. SOF is not a Linux limitation. It is a hardware-vendor decision to put a coprocessor between the OS and the DAC, and that coprocessor's state machine has its own latency floor.
233 +
234 + % ============ 4. CURRENT VS. THEORETICAL FLOOR ============
235 +
236 + \section{Current vs. Theoretical Floor}
237 + \label{sec:floor}
238 +
239 + Table~\ref{tab:floor} sums the path. The \emph{Current} column is the measured or derived state, the \emph{Floor} column is the limit set by physics and minimum kernel work, and the bottom two rows total each hardware configuration.
240 + 241 + \begin{table}[h] 242 + \small 243 + \centering 244 + \begin{tabular}{lrr} 245 + \toprule 246 + \textbf{Stage} & \textbf{Current} & \textbf{Floor} \\ 247 + \midrule 248 + Key contact + scan & 1--5\,ms & 1\,ms \\ 249 + USB HID poll & 0.125--1\,ms & 0.125\,ms \\ 250 + Kernel ISR + evdev & $<$50\,$\mu$s & $<$50\,$\mu$s \\ 251 + poll() wake $\rightarrow$ piece & $<$100\,$\mu$s & $<$100\,$\mu$s \\ 252 + QuickJS dispatch + synth voice & $<$100\,$\mu$s & $<$50\,$\mu$s \\ 253 + Voice $\rightarrow$ next ALSA period & up to 1 period & 0 (best case) \\ 254 + ALSA buffer drain & 4 ms (HDA) & 1--2\,ms \\ 255 + & 80 ms (SOF) & 80\,ms \\ 256 + DAC + amp & $<$1\,ms & $<$1\,ms \\ 257 + \midrule 258 + \textbf{Total (HDA-direct)} & \textbf{$\sim$3--4\,ms} & \textbf{$\sim$2\,ms} \\ 259 + \textbf{Total (SOF)} & \textbf{$\sim$12--22\,ms} & \textbf{$\sim$82\,ms*} \\ 260 + \bottomrule 261 + \end{tabular} 262 + \caption{Latency budget, key-to-DAC. *SOF floor is dominated by the firmware buffer ac-native cannot shrink without losing audio.} 263 + \label{tab:floor} 264 + \end{table} 265 + 266 + The HDA-direct number sits within the 5\,ms threshold below which McPherson et al. showed users cannot reliably distinguish action from sound~\citep{mcpherson2016action}. The SOF number does not. There is no software fix on the Linux side: shrinking the SOF buffer reintroduces the DAPM amp-storm. The only paths to a smaller SOF floor are (a) firmware changes upstream, (b) a kernel patch that reroutes the DAPM events out of the audio fast path, or (c) selecting hardware whose codec is HDA-direct. 267 + 268 + For comparison, the \texttt{notepat} macOS port (\texttt{fedac/native/macos/}) on Apple Silicon shipped with two backends so they could be A/B tested. 
The numbers from the \texttt{AC\_LATENCY\_TEST=40} benchmark in commit \texttt{c6e740192}, with \texttt{AC\_AUDIO\_BUFFER=32}: 269 + 270 + \begin{table}[h] 271 + \small 272 + \centering 273 + \begin{tabular}{lrrrr} 274 + \toprule 275 + \textbf{Backend} & \textbf{min} & \textbf{median} & \textbf{mean} & \textbf{max} \\ 276 + \midrule 277 + SDL3~\citep{sdl3} & 1.42 & 6.47 & 5.99 & 7.32 \\ 278 + CoreAudio direct & 0.08 & \textbf{0.65} & 0.58 & 0.80 \\ 279 + \bottomrule 280 + \end{tabular} 281 + \caption{Mac key-to-sample latency in milliseconds (commit \texttt{c6e740192}).} 282 + \label{tab:macos} 283 + \end{table} 284 + 285 + That is roughly a 10$\times$ reduction from the same hardware, same buffer size, same synthesizer. The commit message draws the conclusion in plain language: \emph{``the bottleneck wasn't the buffer size, it was SDL3's audio stream layering its own schedule on top of CoreAudio's pipeline.''} The number 0.65\,ms is below the underlying CoreAudio scheduling floor we earlier believed in, because that floor turned out to be SDL's own indirection. This is the cleanest single example of the thesis of this paper: each layer between hardware and the app costs a buffer turnaround, and the layer is often invisible until you remove it. 286 + 287 + % ============ 5. WAYLAND, DIRECT KMS, AND DISPLAY ============ 288 + 289 + \section{Wayland, Direct KMS, and Display} 290 + \label{sec:wayland} 291 + 292 + Wayland does not touch audio --- it is a display and input protocol. The audio path described above runs identically with or without a compositor. So the question ``does Wayland add audio latency'' has a clean answer: no. The PipeWire/PulseAudio user-space audio servers \emph{would} add 5--20\,ms of resampling and buffering, but ac-native deliberately bypasses them. ALSA hw-direct. 293 + 294 + Wayland's effect on \emph{input} latency is best illustrated by the project's own choice to remove it. 
An early NixOS-targeted prototype ran ac-native as a Wayland client under the \texttt{cage} kiosk compositor, with input reaching the runtime via \texttt{evdev}\,$\rightarrow$\,\texttt{libinput}\,$\rightarrow$\,\texttt{cage}\,$\rightarrow$\,\texttt{wl\_pointer}\,/\,\texttt{wl\_keyboard}. On 2026-04-05 commit \texttt{28d7ee374} added a workaround --- ``Late \texttt{input\_poll} after display present halves Wayland pointer latency'' --- because the compositor was flushing motion events only after surface commit. One day later, on 2026-04-06, commit \texttt{104022f30} (\emph{drop cage/Wayland, run ac-native DRM-direct on NixOS}) removed the compositor entirely: 295 + 296 + \begin{quote}\small\itshape 297 + This eliminates the entire Wayland input stack (evdev $\rightarrow$ libinput $\rightarrow$ cage $\rightarrow$ wl\_pointer) that caused trackpad drift, cursor conflicts, and input latency. ac-native now owns DRM and evdev directly, same as the bare-metal build. Net: -59 lines of config, zero abstraction layers between ac-native and the hardware. 298 + \end{quote} 299 + 300 + The codebase still ships \texttt{wayland-display.c} (394 lines) for the case where the OS is embedded in an existing Wayland session, alongside \texttt{drm-display.c} (987 lines) for the no-compositor path. When a compositor is present, \texttt{input.c:441,478} record a sharper concern than mere additional latency: the compositor \emph{grabs} the input devices, and reading them in parallel via evdev produces double-counted keys and progressively drifting cursor positions. The fix is conditional --- if a Wayland seat is advertised, evdev is suppressed; otherwise (\texttt{input.c:1135}) ac-native reads evdev directly. The bare-metal ChromeOS-on-Jasper-Lake build always takes the second path. 301 + 302 + For display, the cost of a compositor is one vsync of additional buffering (16.7\,ms at 60\,Hz) on top of whatever the application would otherwise hand to KMS. 
\texttt{drm-display.c} bypasses this: ac-native talks directly to DRM/KMS~\citep{drmkms} and pageflips its own framebuffer. 303 + 304 + The general rule: each layer between hardware and the application costs either a context switch ($\mu$s, free) or a buffer turnaround (one frame or one period, audible/visible). \acos{}'s design is to make the buffering layers \emph{optional}, not structural. 305 + 306 + % ============ 6. NOTEPAT LATENCY HISTORY ============ 307 + 308 + \section{The notepat Latency History} 309 + \label{sec:history} 310 + 311 + The chromatic keyboard piece \texttt{notepat} is the canonical instrument that runs on \acos{}. Its current feel is the result of about \emph{four months} of work spread across the web port, the bare-metal Linux runtime, and the macOS sibling --- not a single April debugging sprint. Read in order, the commits split into nine phases. Reading them as a sequence is the most honest answer to ``where does the present latency come from'' --- because almost none of the work was clever DSP or kernel hacking. It was identifying which layer was secretly buffering, and removing or tuning it. 312 + 313 + \subsection{Phase 0: The browser ceiling and the case for an OS (Jan 9 -- Feb 15)} 314 + 315 + \texttt{notepat} began as a web piece running inside \texttt{aesthetic.computer}'s browser runtime. The first attempt to measure its key-to-sound latency landed on 2026-01-09 in commit \texttt{001a0769a} (\emph{notepat latency testing + KidLisp boot animation}), which added an \texttt{artery/test-notepat-latency.mjs} harness that drove a headless browser, dispatched a synthetic keypress, and timed the appearance of the first non-zero audio sample via \texttt{\_\_bios\_sound\_telemetry}. 
Twenty days later, on 2026-01-29, commit \texttt{22675462f} (\emph{$\boldsymbol\lightning$ notepat: Reduce audio latency target to 3ms}) wrote the goal directly into the piece's \texttt{boot()} as a global hint: 316 + 317 + \begin{lstlisting}[style=accstyle,basicstyle=\ttfamily\footnotesize] 318 + window.|\textcolor{jskw}{\_\_acLatencyHint}| = |\textcolor{jsnum}{0.003}|; // 3ms target 319 + window.|\textcolor{jskw}{\_\_acSampleRate}| = |\textcolor{jsnum}{48000}|; 320 + window.|\textcolor{jskw}{\_\_acSpeakerPerformanceMode}| = |\textcolor{jsstr}{"disabled"}|; 321 + \end{lstlisting} 322 + 323 + Then on 2026-02-15, commit \texttt{578154860} (\emph{add test results showing 417ms latency regression in notepat}) recorded the measurement, judged against a looser 30\,ms target: 324 + 325 + \begin{quote}\small\itshape 326 + Mean latency: 417.28\,ms (keyboard $\rightarrow$ sound detection). Target: $<$30\,ms. Performance degradation: $\sim$14$\times$ slower than target. Status: CRITICAL. Low variance (14\,ms std dev) suggests systematic overhead. 327 + \end{quote} 328 + 329 + The commit's analysis attributed the regression to recent UI work (waveform visualizer, KidLisp visualizer) and proposed Chrome DevTools profiling and OffscreenCanvas migrations. Those changes were made; they did not close the gap meaningfully. The browser stack --- Web Audio + AudioWorklet + cross-origin-isolation requirements + the page event loop + GC pauses --- has a floor that no amount of in-page work could push under tens of milliseconds. In the same weeks, scaffolding for a custom OS appears in the repo (FFOS on 2026-02-06, FedAC on 2026-02-21), and on 2026-03-05 commit \texttt{a94cc50ea} (\emph{feat(fedac): add ALSA audio engine}) lands the C-side multi-voice synthesizer that becomes the foundation of \acos{}'s current audio path. The native OS exists because the browser ceiling existed. 
330 + 331 + \subsection{Phase 1: Make the speakers play at all (Apr 14--15)} 332 + 333 + Before latency could even be measured on the bare-metal build, the SOF-based HP Chromebook G7 (Jasper Lake, \texttt{sof-rt5682}, RT5682 + MAX98360A) had to actually emit sound. The relevant log line at the time was \emph{``everything else correct, but no audio.''} Commits \texttt{8e1663d72}, \texttt{6247baf17}, \texttt{cad9313e4}, \texttt{410c78476}, \texttt{0d423b19f}, \texttt{1c95ab767} walked through ChromeOS's UCM (Use Case Manager) configuration --- enumerating PCMs, picking the speaker verb, fixing the namespace prefix --- to get audio routed to the right output at all. \texttt{877c2336d} stopped enabling UCM Headphones at boot (was silencing speakers). \texttt{e851f5dce} stopped the mixer walk from re-enabling the headphone jack switch (was silencing the speaker amp). \texttt{eb960efd9} forced runtime power management off after the card probed. 334 + 335 + Then the breakthrough: \textbf{\texttt{3e3608733} (Apr 14, \emph{native: SOF-aware audio period sizing})}. The commit message names the bug exactly: 336 + 337 + \begin{quote}\small\itshape 338 + Root cause of silent G7 speakers with everything else correct: the old audio.c config hardcoded $\sim$1\,ms ALSA period (rate/1000 frames) + 4\,ms total buffer. Works fine on HDA-direct codecs (ThinkPad X13 etc.) but on SOF+MAX98360A the boot log shows \textbf{10{,}686 sdmode toggles per boot} --- once per period, matching the $\sim$5\,ms toggle spacing exactly. 339 + \end{quote} 340 + 341 + The 1\,ms period that was a feature on HDA hardware was actively breaking SOF: at 4\,ms buffer depth, the 16\,ms paint-loop submission cadence missed every period, the stream underran constantly, and the MAX98360A amplifier's DAPM event handler flipped \texttt{SD\_MODE} high and low faster than the chip could stabilize. 
Detection of SOF (presence of \texttt{sound/soc/sof} card and absence of legacy HDA codec97) and a bump to 10\,ms period / 40\,ms buffer fixed it. ``Same shape ChromeOS uses on Dedede'' --- the commit message explicitly borrows ChromeOS's known-good buffer geometry for this hardware family. 342 + 343 + \subsection{Phase 1.5: Drop cage, run DRM-direct (Apr 5--6)} 344 + 345 + A week before the SOF debugging, the runtime made the bigger structural decision: stop running under a Wayland compositor at all. The NixOS-targeted prototype had been using \texttt{cage} as a single-purpose kiosk compositor. On 2026-04-05, commit \texttt{28d7ee374} (\emph{wifi regulatory db, silent boot, trackpad latency}) landed a workaround --- ``Late \texttt{input\_poll} after display present halves Wayland pointer latency'' --- because the compositor was flushing motion events only after surface commit, doubling effective input latency. One day later, on 2026-04-06, commit \texttt{104022f30} removed the compositor outright: \emph{``This eliminates the entire Wayland input stack (evdev $\rightarrow$ libinput $\rightarrow$ cage $\rightarrow$ wl\_pointer) that caused trackpad drift, cursor conflicts, and input latency. Net: -59 lines of config, zero abstraction layers between ac-native and the hardware.''} \texttt{wayland-display.c} survives in the tree as an embed-into-someone-else's-session option, but the production build is DRM-direct. 346 + 347 + \subsection{Phase 2: The Goldilocks dither saga (Apr 15--16)} 348 + 349 + Once audio was reaching the amp, a second SOF bug emerged: the silence detector inside the SOF DSP would gate the speaker pipeline whenever the buffer fell below a few dB FS, even briefly. Sustained tones cut off mid-note. The fix was to inject inaudible keepalive dither. 350 + 351 + \begin{description} 352 + \item[\texttt{319732304}] (Apr 15, evening) injected $\pm$1\,LSB dither when the buffer would otherwise be all zeros. 
\emph{``Should be inaudible; enough to hold the SSP1 BE DAI active.''} 353 + \item[\texttt{e075ebac5}] (Apr 15, $\sim$1\,h later) bumped to $\pm$160. The $\pm$1 dither extended the sustain only to 96\,s before the silence detector gave up. 354 + \item[\texttt{7add48bb5}] (Apr 15, evening) reduced back to $\pm$1. \emph{``$\pm$160 was audible as a 24\,kHz fizz.''} 355 + \item[\texttt{ec143aca7}] (Apr 16) settled on $\pm$32 (\emph{``-72 dBFS at S32\_LE, still inaudible''}) and at the same time raised the SOF period/buffer from 10/40\,ms to \textbf{20/80\,ms}: the 10\,ms period was too tight, producing XRUNs and short writes (96 of 480 frames). 356 + \end{description} 357 + 358 + This is the latency floor for SOF hardware in the current codebase: 80\,ms of buffering, sitting on top of a firmware that requires it. No cleverness in the synth or the JS layer can recover those milliseconds. 359 + 360 + \subsection{Phase 3: Format and signal level (Apr 15--16)} 361 + 362 + \texttt{f246470ea} preferred \texttt{plughw:} for the speaker PCM and logged the negotiated format. \texttt{72476348f} negotiated \textbf{S32\_LE} for the SOF path (the topology accepts it; converting in userspace eliminated a class of crunchy/quiet artifacts that had been misdiagnosed as XRUNs). \texttt{474237ee4} added a tanh soft-limiter and shaped dither for cleaner peak handling. \texttt{b7ab5a5de}, \texttt{c28b18eef}, \texttt{f6ca477f9}, \texttt{153ce7c09} are the loudness arc: $-80$\,dB attenuation bug fix, then $+4$\,dB peak via a 0.85 soft-clip knee, then default volume tuning on SOF (which came up quieter than HDA). None of these are latency commits per se, but several of them were attempts to fix what \emph{sounded} like latency (sluggish notes, late-arriving transients) and turned out to be amplitude or format problems. 
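As a rough illustration of the two signal-path tricks from Phases 2 and 3, the sketch below shows a keepalive dither that only fires on an all-zero buffer, and a tanh soft clip with a 0.85 knee. It is written in Python for brevity rather than in the C of \texttt{audio.c}. The function names, the alternating-sign dither pattern, and the exact knee formula are assumptions made for this example; none of them is taken from the commits.

```python
import math


def keepalive_dither(frames, amplitude=32):
    """Return frames unchanged unless every sample is zero.

    For an all-zero buffer, emit +amplitude / -amplitude on alternating
    samples (an assumed pattern) so the output is never digital silence.
    """
    if any(frames):
        return list(frames)
    return [amplitude if i % 2 == 0 else -amplitude for i in range(len(frames))]


def soft_clip(x, knee=0.85):
    """Pass |x| <= knee through unchanged; bend the excess with tanh.

    The curve is continuous in value and slope at the knee and never
    exceeds 1.0 in magnitude.
    """
    mag = abs(x)
    if mag <= knee:
        return x
    headroom = 1.0 - knee
    shaped = knee + headroom * math.tanh((mag - knee) / headroom)
    return math.copysign(shaped, x)


if __name__ == "__main__":
    print(keepalive_dither([0, 0, 0, 0]))   # silent buffer gets dither
    print(keepalive_dither([0, 1200, 0]))   # real signal is left alone
    print(soft_clip(0.5), soft_clip(1.5))   # below the knee vs. above it
```

At a 48\,kHz sample rate an alternating-sign pattern sits at 24\,kHz, which would be consistent with the fizz reported at $\pm$160, though the commit messages quoted here do not say how the dither was generated.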
363 + 364 + \subsection{Phase 4: Synth quality and perceptual tightness (Apr 17--21)} 365 + 366 + A note's perceived onset latency depends as much on the transient shape of the synthesized voice as on the audio buffer. \texttt{cf3ca7f43} replaced the noise-based default voice with a Karplus-Strong plucked string --- a delay-line waveguide whose first half-cycle has the entire excitation in it, so the attack lands inside the first millisecond regardless of buffer size. \texttt{3928613fe} extracted a shared \texttt{synth\_core} so the same model set runs on both bare-metal Linux and the macOS port (parity makes A/B testing meaningful). \texttt{803188b95} fixed harp loudness and sustain, \texttt{93b3568b0} keyed Shift-pluck on duration, \texttt{6a5e6cf3e} added a master volume + drive (tanh soft-sat) FX block. 367 + 368 + The two perceptual-only commits are worth calling out: 369 + 370 + \begin{description} 371 + \item[\texttt{608a746fb}] (Apr 21, \emph{reverse replay is locked to visual cursor + capture pauses}). Visual feedback was drifting against the audio playback; the commit pinned the visual cursor directly to elapsed-since-press wall-clock time and paused the capture ring during reverse hold. The synth's audio latency did not change, but the user-reported \emph{``hjanky''} feel went away. 372 + \item[\texttt{7a2e69f92}] (Apr 21, \emph{wobble/flange FX + snap-release}). Made the envelope's release \emph{snap} to zero instead of easing back over $\sim$20 frames. Quote from the commit: \emph{``those $\sim$333\,ms of sweeping visual had no sound behind them. User reported this as `extra dead silent time at the end of every reverse gesture' and noted it made the snap-back feel unresponsive.''} A pure perceptual fix --- no audio latency reduction, but the instrument felt 333\,ms faster on release. 
373 + \end{description} 374 + 375 + \subsection{Phase 5: NuPhy analog input (Apr 18--20)} 376 + 377 + The NuPhy HE is an analog Hall-effect keyboard that exposes per-key pressure values via a vendor hidraw report alongside its standard HID keyboard interface. Three commits matter: 378 + 379 + \begin{description} 380 + \item[\texttt{d8b28e65c}] simplified the evdev filter to suppress NuPhy keys when the hidraw stream is active. Without this, every key fired twice (once from evdev, once from the analog handler), audible as a flam. 381 + \item[\texttt{f27960700}] added analog smoothing, a dark theme, and boot performance and media-key fixes in one batch. 382 + \item[\textbf{\texttt{18880d7a8}}] (Apr 20, \emph{velocity-capture + pressure smoothing + NuPhy badge/gauge}). Diagnosed audible popping on sustained sine tones: \emph{``raw pressure samples arrive at driver-specific ADC step rates and the sim loop was writing every raw value straight into synth.update() each frame, so discrete pressure steps produced audible clicks.''} Fix: a one-pole lowpass ($\alpha$=0.20, $\sim$80\,ms time constant) plus rate-limit synth.update() to changes $>$0.5\%. This is the single commit where input-side latency was deliberately \emph{added} (80\,ms of pressure smoothing) to remove an audio artifact. The note onset path is unchanged --- key-down still triggers the synth voice on the same frame. 383 + \end{description} 384 + 385 + \subsection{Phase 6: macOS port latency arc (Apr 18)} 386 + 387 + The macOS port (\texttt{fedac/native/macos/}, SDL3 + CoreAudio) gave us the cleanest A/B benchmark of the whole project --- and the most surprising finding. 388 + 389 + \begin{description} 390 + \item[\texttt{04dea9da7}] (Apr 18, 16:53) requested a 128-frame device buffer via \texttt{SDL\_HINT\_AUDIO\_DEVICE\_SAMPLE\_FRAMES}. CoreAudio's default was 2048--4096 frames (40--85\,ms). 128 frames at 48\,kHz is $\sim$2.7\,ms. 
391 + \item[\texttt{c8256aa29}] (Apr 18, 17:13) added the \texttt{AC\_LATENCY\_TEST=$n$} benchmark: vsync off, $n$ back-to-back keypress injections, with rearm/settle between each, prints min/median/mean/max + sample list. Tightened the buffer to 64 frames. Median held at $\sim$6.4\,ms; jitter ceiling fell from $\sim$11\,ms to $\sim$7\,ms. The commit message concluded \emph{``smaller buffers don't lower the median further but can hit the min''} --- which we interpreted as the CoreAudio pipeline floor. 392 + \item[\textbf{\texttt{c6e740192}}] (Apr 18, 17:32) tested that conclusion. A direct \texttt{kAudioUnitSubType\_DefaultOutput} backend was added alongside the SDL3 one, with \texttt{kAudioDevicePropertyBufferFrameSize} set on the device before AU instantiation. The benchmark in Table~\ref{tab:macos} dropped the median from 6.47\,ms to 0.65\,ms --- a 10$\times$ improvement that no buffer-size tweak had been able to reach. The bottleneck was not CoreAudio. It was SDL3's audio-stream abstraction layering its own schedule on top. 393 + \end{description} 394 + 395 + This is the pattern of the entire project condensed into 39 minutes of git history: a buffering layer that no one named was costing more than the layer that was named. The fix was to remove it. 396 + 397 + \subsection{Phase 7: Boot acceleration (Apr 20)} 398 + 399 + Latency to \emph{first} note also includes how fast the OS reaches a playable state from cold. \texttt{bc16acb76} backgrounded diagnostic dumps and ran USB mount/GPU wait in parallel during init. \texttt{11c6a6ff8} cut the startup-fade boot animation from 3\,s to 2\,s with a matrix-rain background, and explicitly framed the change as \emph{``boot into notepat faster.''} Time from power button to interactive note dropped to 7.3\,s. None of this is per-keypress latency, but it sets the threshold above which the device feels like a tool versus a toy. 
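The buffer sizes quoted in Phase 6 are easy to sanity-check. The sketch below converts a buffer length in frames to milliseconds at 48\,kHz and computes the kind of min/median/mean/max summary that the \texttt{AC\_LATENCY\_TEST} benchmark is described as printing. The helper names are hypothetical, and the real benchmark lives in the macOS port's own code; this Python version only illustrates the arithmetic.

```python
import statistics


def frames_to_ms(frames, sample_rate=48_000):
    """Duration of a buffer of `frames` sample frames, in milliseconds."""
    return 1000.0 * frames / sample_rate


def summarize(samples_ms):
    """Min / median / mean / max of a list of latency samples (ms)."""
    return {
        "min": min(samples_ms),
        "median": statistics.median(samples_ms),
        "mean": statistics.fmean(samples_ms),
        "max": max(samples_ms),
    }


if __name__ == "__main__":
    for frames in (64, 128, 2048, 4096):
        print(f"{frames:5d} frames = {frames_to_ms(frames):6.2f} ms")
    # Ratio of the two medians reported in the macOS table.
    print(f"speedup: {6.47 / 0.65:.2f}x")
```

Under these assumptions 128 frames comes out at about 2.67\,ms and 2048 to 4096 frames at about 43 to 85\,ms, in line with the figures quoted in Phase 6. The ratio of the two medians in Table~\ref{tab:macos} is just under 10.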
400 + 401 + \subsection{What the history says} 402 + 403 + These nine phases describe maybe 40 commits over about four months, most of them concentrated in April and almost all of them merged on the same branch as ordinary product work. There is no separate ``latency project.'' The numbers got better because every time a user-reported feel issue (\emph{popping, crunchy, dead silent time, hjanky}) was investigated, the investigation forced a layer to become visible. Sometimes the answer was buffer size; more often it was a firmware silence detector, an unwanted abstraction, an envelope shape, or a duplicated input device. The instrument is fast not because it was optimized, but because each obstruction was identified and removed in turn. 404 + 405 + % ============ 7. WHAT IS LEFT TO SQUEEZE ============ 406 + 407 + \section{What is Left to Squeeze} 408 + \label{sec:future} 409 + 410 + \begin{itemize} 411 + \item \textbf{Direct evdev grab on Wayland.} Currently when a compositor is present, ac-native disables evdev to avoid double-input. A small refactor would let ac-native take an exclusive grab via \texttt{EVIOCGRAB} and skip the Wayland keyboard hop entirely while remaining a Wayland client for display. Saves $\sim$200\,$\mu$s and eliminates an entire scheduling boundary. 412 + 413 + \item \textbf{ALSA mmap mode.} \texttt{snd\_pcm\_writei} copies frames into the kernel's ring; mmap access (\texttt{snd\_pcm\_mmap\_begin}\,/\,\texttt{snd\_pcm\_mmap\_commit}) maps the ring into userspace so the synth can render into it directly, which removes the copy. On HDA paths this could trim a fraction of a millisecond. On SOF the firmware buffer dominates so the change is invisible. 414 + 415 + \item \textbf{IRQ thread priority and CPU pinning.} Linux RT throttling and IRQ-thread priorities are at default values. Pinning the audio thread to a single CPU and elevating the relevant IRQ thread to \texttt{SCHED\_FIFO} would tighten the jitter ceiling, though probably not the median. The cost is making the system less polite to other workloads, which on a single-piece appliance is acceptable. 
416 + 417 + \item \textbf{High-poll-rate keyboards.} A 1\,kHz USB polling rate is already standard on the development NuPhy. Moving to an 8\,kHz polling keyboard cuts the worst-case polling wait from 1\,ms to 0.125\,ms, a saving of about 0.9\,ms in the worst case and roughly 0.4\,ms on average. 418 + 419 + \item \textbf{HDA-only build.} An option flag at build time to refuse SOF hardware would let the synth use a 1\,ms period unconditionally. This is a hardware-selection decision dressed up as a software switch, but it is honest about the trade. 420 + \end{itemize} 421 + 422 + None of these will produce a dramatic step change. They will each save microseconds to a few milliseconds. The dramatic step changes are behind us, and they were not where you would think. The single biggest one was leaving the browser: 417\,ms (web) $\rightarrow$ 3--4\,ms (bare-metal HDA), about a 100$\times$ improvement, achieved by writing a Linux runtime instead of trying to optimize Web Audio further. The others, in descending order: removing the SDL3 audio abstraction on macOS (6.47 $\rightarrow$ 0.65\,ms median, 10$\times$); dropping the \texttt{cage} Wayland compositor for DRM-direct on NixOS (which removed trackpad drift plus an unbounded compositor-scheduled latency tail); tuning the ALSA period to the codec's ceiling on HDA (4\,ms buffer, $\sim$1\,ms period); identifying the SOF DAPM trap (which had prevented an entire class of laptops from emitting audio at all). Each of these was an abstraction layer somebody else had quietly added that turned out to cost a buffer turnaround. 423 + 424 + % ============ 8. CONCLUSION ============ 425 + 426 + \section{Conclusion} 427 + 428 + So, Parag: an interrupt is a wire that the CPU has agreed to listen to. The keyboard pulls one when a key changes. The sound card pulls one when its DMA buffer is about to run dry. Between those two events the kernel does roughly 100\,$\mu$s of work and userspace does another 100\,$\mu$s --- those are not the latency. 
The latency is the buffers we keep in flight to absorb scheduling jitter, plus whatever firmware sits between the OS and the silicon, plus whatever userspace abstractions someone added before us. On a ThinkPad with HDA-direct audio, \acos{} is at $\sim$3--4\,ms key-to-DAC, against a $\sim$2\,ms physics floor. On an HP Chromebook G7 with SOF, the floor itself is $\sim$80\,ms. The same instrument was at 417\,ms in a browser earlier this year. The native OS exists because that number could not be moved by working inside the browser; it can be moved by leaving. 429 + 430 + The right question for an instrument is never ``how low can the average go,'' it is ``how predictable is the worst case.'' That is what the recent work has been about. The audible thresholds (10\,ms for intimate musical control, 20\,ms before the action-sound bond starts to break) are met on the hardware we recommend. They are not yet met on every laptop in a thrift bin --- not because the kernel cannot deliver, but because the firmware sometimes will not let it. 431 + 432 + \vspace{0.5em} 433 + \noindent\textbf{Acknowledgments.} For Parag, who asked the question that made this paper worth writing. The numbers in this paper are derived from the public commit history of the \texttt{aesthetic-computer} repository as of April 2026. 434 + 435 + \vspace{0.5em} 436 + \noindent\textbf{ORCID:} \href{https://orcid.org/0009-0007-4460-4913}{0009-0007-4460-4913} 437 + 438 + % ============ REFERENCES ============ 439 + 440 + \bibliographystyle{plainnat} 441 + \bibliography{references} 442 + 443 + \end{document}

+129

papers/siggraph-asia-2026-tech/references.bib

··· 1 + @misc{scudder2026ac, 2 + title={Aesthetic Computer '26: A Mobile-First Runtime for Creative Computing}, 3 + author={{@jeffrey}}, 4 + year={2026}, 5 + note={Companion paper describing the AC platform} 6 + } 7 + 8 + @misc{scudder2026os, 9 + title={AC Native OS '26: A Bare-Metal Creative Computing Operating System}, 10 + author={{@jeffrey}}, 11 + year={2026}, 12 + note={Companion paper describing the bare-metal Linux build} 13 + } 14 + 15 + @misc{scudder2026notepat, 16 + title={notepat.com: From Keyboard Toy to System Front Door}, 17 + author={{@jeffrey}}, 18 + year={2026}, 19 + note={Companion paper on the chromatic keyboard piece} 20 + } 21 + 22 + @misc{linuxkernel, 23 + title={The Linux Kernel}, 24 + author={{Linux Kernel Contributors}}, 25 + year={2026}, 26 + note={Version 6.14.2 used by AC Native OS}, 27 + url={https://www.kernel.org/} 28 + } 29 + 30 + @misc{alsa, 31 + title={Advanced Linux Sound Architecture (ALSA) Project}, 32 + author={{ALSA Project}}, 33 + year={2026}, 34 + url={https://www.alsa-project.org/} 35 + } 36 + 37 + @misc{evdev, 38 + title={Linux Input Subsystem User-Space API (evdev)}, 39 + author={{Linux Kernel Documentation}}, 40 + year={2026}, 41 + note={Documentation/input/input.rst} 42 + } 43 + 44 + @misc{libinput, 45 + title={libinput --- A library to handle input devices}, 46 + author={Hutterer, Peter}, 47 + year={2026}, 48 + url={https://www.freedesktop.org/wiki/Software/libinput/} 49 + } 50 + 51 + @misc{wayland, 52 + title={Wayland Protocol Specification}, 53 + author={H{\o}gsberg, Kristian and others}, 54 + year={2025}, 55 + url={https://wayland.freedesktop.org/} 56 + } 57 + 58 + @misc{drmkms, 59 + title={Linux DRM/KMS Subsystem Documentation}, 60 + author={{Linux Kernel Documentation}}, 61 + year={2026}, 62 + note={Documentation/gpu/drm-kms.rst} 63 + } 64 + 65 + @misc{usbhid, 66 + title={Universal Serial Bus HID Usage Tables, Version 1.4}, 67 + author={{USB Implementers Forum}}, 68 + year={2022} 69 + } 70 + 71 + @misc{sof, 72 + 
title={Sound Open Firmware Project}, 73 + author={{Sound Open Firmware Contributors}}, 74 + year={2026}, 75 + url={https://thesofproject.github.io/latest/index.html} 76 + } 77 + 78 + @misc{coreaudio, 79 + title={Core Audio Overview}, 80 + author={{Apple Inc.}}, 81 + year={2025}, 82 + note={developer.apple.com/library/archive/documentation/MusicAudio/} 83 + } 84 + 85 + @misc{sdl3, 86 + title={Simple DirectMedia Layer 3.0}, 87 + author={Lantinga, Sam and others}, 88 + year={2025}, 89 + url={https://libsdl.org/} 90 + } 91 + 92 + @article{wessel2002problems, 93 + title={Problems and Prospects for Intimate Musical Control of Computers}, 94 + author={Wessel, David and Wright, Matthew}, 95 + journal={Computer Music Journal}, 96 + volume={26}, 97 + number={3}, 98 + pages={11--22}, 99 + year={2002} 100 + } 101 + 102 + @inproceedings{mcpherson2016action, 103 + title={Action-Sound Latency: Are Our Tools Fast Enough?}, 104 + author={McPherson, Andrew and Jack, Robert and Moro, Giulio}, 105 + booktitle={Proceedings of the International Conference on New Interfaces for Musical Expression (NIME)}, 106 + pages={20--25}, 107 + year={2016} 108 + } 109 + 110 + @inproceedings{jack2018sub, 111 + title={Sub-Millisecond Latency Audio with Bela}, 112 + author={Jack, Robert and Moro, Giulio and McPherson, Andrew}, 113 + booktitle={Audio Engineering Society Convention 144}, 114 + year={2018} 115 + } 116 + 117 + @misc{bela, 118 + title={Bela: Real-Time Audio and Sensors for Embedded Systems}, 119 + author={{Augmented Instruments Lab}}, 120 + year={2025}, 121 + url={https://bela.io/} 122 + } 123 + 124 + @misc{quickjs, 125 + title={QuickJS Javascript Engine}, 126 + author={Bellard, Fabrice}, 127 + year={2024}, 128 + url={https://bellard.org/quickjs/} 129 + }
