feat(audio): STK waveguide whistle + shift accent modifier

Two significant audio changes, one control refactor, and one bundled label tweak:

1. **Whistle rewritten as digital waveguide (Cook/STK model)**

Previous implementation used two biquad resonators (main fundamental
Q=45 + formant Q=14) driven by white noise. User reported "still very
airy" after multiple tuning passes — biquads fundamentally cannot
produce a whistled tone, only filtered noise.

New implementation follows Perry Cook's STK Flute algorithm:

    breath ──► (+) ──► jetDelay ──► NL(x*(x*x-1)) ──► dcBlock ──► (+) ──► boreDelay ──┬──► out
                ▲                                                  ▲                  │
                │ −jetRefl·temp                                    │ +endRefl·temp    │
                │                                                  │                  │
                └───────── 1-pole LPF ◄────────────────────────────┴──────────────────┘

Key insights from STK Flute.h / Faust flute.dsp research:

- The BORE delay line (length = SR/freq) is the primary resonator.
Its closed-loop feedback generates ALL harmonics automatically via
comb filtering. You don't need 5-10 biquads tuned to harmonics:
the delay line IS a comb filter with a tooth at every harmonic up to
Nyquist.
- The JET delay (0.32 × bore) models the air jet's travel time
across the embouchure. jetRatio controls flute vs pennywhistle vs
ocarina character.
- CRITICAL: the nonlinearity is y = x*(x*x - 1), NOT tanh. The cubic
has a negative-slope region around x=0 (slope 3x² - 1 < 0 for
|x| < 1/√3) which makes it a limit-cycle generator: it converts
steady DC breath pressure into sustained
oscillation. tanh is monotonic and can only saturate; it cannot
self-oscillate. This was the core missing ingredient.
- The loop filter is a SIMPLE 1-pole LPF (coefficients 0.35/0.65),
not a high-Q biquad. The 1-pole has unity gain at DC
(0.35 / (1 - 0.65) = 1) and rolls off high harmonics for natural
bore damping; the 0.5 end-reflection coefficient is what keeps the
bore loop under unity gain. A biquad would kill the harmonics we're
trying to keep.
- Breath excitation must include a DC component: noise alone cannot
drive oscillation. `breath = dc * (1 + noise_gain*noise + vibrato)`.
- DC blocker after the cubic removes the bias the NL would otherwise
pump into the loop.

Added ACVoice fields: whistle_bore_buf[2048], whistle_jet_buf[512],
whistle_bore_w, whistle_jet_w, whistle_lp1, whistle_hp_x1/y1.
Removed biquad fields (whistle_main_*, whistle_formant_*).
Buffers cleared on voice init.
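
To see how the buffer sizes relate to pitch, here is a small sizing sketch. It reuses the 2048/512 lengths, the 0.32 jet ratio, and the 2-sample margin from this commit, but the function names are hypothetical and the code is not part of audio.c.

```c
#define BORE_N 2048 /* whistle_bore_buf length from this commit */
#define JET_N   512 /* whistle_jet_buf length from this commit */

/* Bore delay in samples: one period (SR / freq), clamped so a
   2-sample margin remains for linear interpolation. */
static double bore_delay_samples(double sample_rate, double freq) {
    double d = sample_rate / freq;
    double max = (double)(BORE_N - 2);
    return d > max ? max : d;
}

/* Jet delay in samples: 0.32 x bore, clamped to the jet buffer. */
static double jet_delay_samples(double bore_delay) {
    double d = bore_delay * 0.32;
    double max = (double)(JET_N - 2);
    return d > max ? max : d;
}
```

At 48 kHz a 110 Hz note needs about 436 bore samples and 140 jet samples, well inside both buffers. At 192 kHz the same note needs about 1745 bore samples, which fits, but 0.32 × 1745 ≈ 559 exceeds the 510-sample jet limit, so with these sizes the jet delay would be clamped for notes below roughly 120 Hz at that rate.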

Source refs:
- https://ccrma.stanford.edu/software/stk/Flute_8h_source.html
- https://github.com/thestk/stk/blob/master/include/JetTable.h
- https://github.com/grame-cncm/faust/blob/master-dev/examples/physicalModeling/faust-stk/flute.dsp

2. **Shift-as-accent modifier** (notepat.mjs). Previous behavior: tap
shift toggled quickMode (changed attack from 0.005 to 0.002 and
release from 0.08 to 0.02 — subtle, rarely useful). User asked to
deprecate shift's current behavior and use it as an ACCENT modifier
— capitalized/shifted playing = louder, more impactful hits.

New behavior:
- Tap shift on its own: no-op (quickMode toggle removed)
- Hold shift while triggering a note: velocity × 1.4 (capped to 1.0,
so a digital key already at full velocity gets no boost)
- Hold shift while triggering a drum: drumVol × 1.5 (no cap, the
drum bus compressor handles peak control)

This works live — you can rapidly alternate shifted/unshifted hits
for dynamic accents without any mode switching. Natural "capital
letter = louder" feel.
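
The two accent rules can be written as pure functions. The real logic lives in notepat.mjs (JavaScript); this is a C transliteration for illustration, and the function names are hypothetical.

```c
/* Note accent: 1.4x boost while shift is held, capped at 1.0
   (the engine's expected velocity range for notes). */
static double accent_note_velocity(double velocity, int shift_held) {
    if (!shift_held) return velocity;
    double boosted = velocity * 1.4;
    return boosted > 1.0 ? 1.0 : boosted;
}

/* Drum accent: 1.5x while shift is held, uncapped; peak control is
   left to the drum bus compressor. */
static double accent_drum_volume(double velocity, double master, int shift_held) {
    double accent = shift_held ? 1.5 : 1.0;
    return (1.0 + velocity * 0.8) * master * accent;
}
```

With these definitions a pressure of 0.5 becomes 0.7 when accented, while a velocity of 1.0 stays at 1.0; a full-velocity drum at master 1.0 goes from 1.8 to 2.7.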

3. **Drum bus peak compressor (audio.c)** — bundled from earlier work
that was cancelled mid-build. 0.95 threshold, 5ms attack, 200ms
release. Preserves individual hit dynamics while preventing the
soft_clip saturation that was making rapid kicks feel "stacking
quieter". Each hit's initial transient passes through at full
amplitude; sustained overlap gets gain-reduced gracefully.

4. **Full drum pad labels when space allows (notepat.mjs)** — bundled.
Uses PERCUSSION_NAMES ("kick", "snare", "splash", etc.) when the
pad is wide enough to fit the full word. Falls back to 3-char
abbreviations (BAS/SNR/SPL) only when tight. User: "across the
board in notepat no need to truncate terms".

Cancelled in-flight dynamic-coot (was at 96s) to bundle these changes
— net savings: one flash cycle, and the whistle rewrite needs testing
as a complete unit.

prompt.ac/@jeffrey 3 weeks ago 66ac0d9c 9c15b86a

+150 -90

3 changed files

+18 -3

fedac/native/pieces/notepat.mjs

@@ -1390,7 +1390,12 @@
       system?.jump?.("prompt");
       return;
     }
-    if (key === "shift") { quickMode = !quickMode; return; }
+    // Shift key: LIVE-HELD accent modifier (was: quick mode toggle).
+    // shiftHeld is already tracked at the top of act(); note/drum triggers
+    // check it to boost velocity so capitalized/shifted playing feels
+    // like accenting. The old quick-mode toggle is deprecated — tap
+    // shift does nothing on its own now.
+    if (key === "shift") return;
     if (key === "tab") {
       const idx = (waveIndex + 1) % wavetypes.length;
       setWave(wavetypes[idx], sound);
@@ -1572,7 +1577,12 @@
     const pan = Math.max(-0.8, Math.min(0.8, (semitones - 12) / 15));
 
     // Start at current pressure (or full for digital keys)
-    const velocity = e.pressure > 0 ? e.pressure : 1.0;
+    // Velocity starts at analog pressure (or full for digital keys).
+    // Shift-as-accent: when shiftHeld, boost velocity by 1.4× (capped
+    // at 1.0 so we don't exceed the engine's expected range for notes;
+    // drums will scale their volume separately via the accent path below).
+    let velocity = e.pressure > 0 ? e.pressure : 1.0;
+    if (shiftHeld) velocity = Math.min(1.0, velocity * 1.4);
     // Per-side master mix — scales every note and drum on this grid.
     const master = masterForSide(offset);
     const baseVol = 0.15 + velocity * 0.55; // un-mastered; sim() re-applies master live
@@ -1604,7 +1614,12 @@
       // melodic voices (~2.1 weight) the kick still lands at ~0.5 vs
       // each melody at ~0.17 — roughly 3× prominence. Then scale the
       // whole thing by the per-side master so the user can balance L/R.
-      const drumVol = (1.0 + velocity * 0.8) * master;
+      // Base drum volume from velocity, then apply shift-accent boost
+      // (shift-held = capitalized = accented hit). Drums get more
+      // aggressive accent than notes since the compressor will rein
+      // them in anyway.
+      const accent = shiftHeld ? 1.5 : 1.0;
+      const drumVol = (1.0 + velocity * 0.8) * master * accent;
       const drumPitch = Math.pow(2, effectivePitchShift());
       // Drums get their own pan (kit geometry + grid bias), not the
       // QWERTY physical key position → pan. Left keys pan left, right pan right.

+115 -78

fedac/native/src/audio.c

@@ -123,94 +123,121 @@
     return env;
 }
 
-static inline void whistle_set_resonator(double freq, double q, double sample_rate,
-                                         double *c1, double *c2, double *gain) {
-    freq = clampd(freq, 60.0, sample_rate * 0.45);
-    q = clampd(q, 1.0, 64.0);
-
-    double w0 = 2.0 * M_PI * freq / sample_rate;
-    double damping = M_PI * freq / (q * sample_rate);
-    double r = 1.0 - damping;
-    if (r < 0.0) r = 0.0;
-    if (r > 0.99995) r = 0.99995;
-
-    *c1 = 2.0 * r * cos(w0);
-    *c2 = -(r * r);
-    *gain = 1.0 - r;
+// Fractional-delay read from a ring buffer. `delay` is in samples, allows
+// non-integer values via linear interpolation between adjacent samples.
+// Returns the sample `delay` positions behind the write cursor.
+static inline double whistle_frac_read(const float *buf, int N, int w, double delay) {
+    if (delay < 0.0) delay = 0.0;
+    if (delay > (double)(N - 2)) delay = (double)(N - 2);
+    double rd = (double)w - delay;
+    while (rd < 0.0) rd += (double)N;
+    int i0 = (int)rd;
+    int i1 = (i0 + 1) % N;
+    double f = rd - (double)i0;
+    return (double)buf[i0] * (1.0 - f) + (double)buf[i1] * f;
 }
 
-static inline void whistle_update_coeffs(ACVoice *v, double sample_rate) {
-    double freq = clampd(v->frequency, 110.0, sample_rate * 0.20);
-    double vibrato = sin(2.0 * M_PI * v->whistle_vibrato_phase) * (0.0025 + 0.0015 * v->whistle_breath);
-    double tuned = freq * (1.0 + vibrato);
-    if (fabs(tuned - v->whistle_coeff_freq) < 0.35) return;
+// Cook/STK digital waveguide flute model.
+// The signal flow (see reports/research for full derivation):
+//
+//   breath ──► (+) ──► jetDelay ──► NL(x*(x*x-1)) ──► dcBlock ──► (+) ──► boreDelay ──┬──► out
+//               ▲                                                  ▲                  │
+//               │ −jetRefl·temp                                    │ +endRefl·temp    │
+//               │                                                  │                  │
+//               └───────── 1-pole LPF ◄────────────────────────────┴──────────────────┘
+//
+// The BORE delay line (length = SR/freq) is the primary resonator. Its
+// closed-loop feedback generates ALL harmonics automatically via comb
+// filtering — the delay line is inherently a periodic waveguide that
+// sustains exactly at integer multiples of its natural pitch.
+//
+// The JET delay (length ≈ 0.32 × bore) models the air jet's travel time
+// across the embouchure hole. The cubic nonlinearity x*(x*x-1) has
+// negative-slope region at x=0 which makes it a LIMIT-CYCLE GENERATOR —
+// it converts steady DC breath pressure into sustained oscillation.
+// This is qualitatively different from tanh, which is monotonic and
+// can only saturate.
+//
+// The 1-pole LPF in the loop models bore losses (viscothermal damping)
+// so the tone darkens as harmonics decay faster than the fundamental.
+//
+// The DC blocker after the NL removes the bias the cubic would pump
+// into the bore loop, which would otherwise drive it into clipping.
+static inline double generate_whistle_sample(ACVoice *v, double sample_rate) {
+    double env = compute_envelope(v);
+    // Breath envelope — DC pressure component + noise modulation + vibrato.
+    // CRITICAL: the DC component is what drives the nonlinearity into
+    // self-oscillation. Without a steady DC term, noise alone cannot
+    // sustain the limit cycle.
+    double breath_target = 0.18 + 0.82 * sqrt(env);
+    double breath_slew = env > v->whistle_breath ? 0.012 : 0.003;
+    v->whistle_breath += (breath_target - v->whistle_breath) * breath_slew;
 
-    // Very sharp main resonance — previous tunings (Q≈18, Q≈26) still let
-    // too much broadband noise bleed through. Q≈45 gives a razor-thin
-    // bandwidth around the fundamental so the whistle reads as pitched
-    // even at low amplitudes.
-    double main_q = 45.0 + tuned * 0.025;
-    double formant_ratio = tuned < 700.0 ? 2.05 : 2.35;
-    double formant_q = 14.0 + tuned * 0.005;
+    // Vibrato LFO — ~5 Hz, small depth
+    v->whistle_vibrato_phase += 5.0 / sample_rate;
+    if (v->whistle_vibrato_phase >= 1.0) v->whistle_vibrato_phase -= 1.0;
+    double vibrato = sin(2.0 * M_PI * v->whistle_vibrato_phase) * 0.03;
 
-    whistle_set_resonator(tuned, main_q, sample_rate,
-                          &v->whistle_main_c1, &v->whistle_main_c2, &v->whistle_main_gain);
-    whistle_set_resonator(tuned * formant_ratio, formant_q, sample_rate,
-                          &v->whistle_formant_c1, &v->whistle_formant_c2, &v->whistle_formant_gain);
-    v->whistle_coeff_freq = tuned;
-}
+    // Breath noise — multiplicatively modulates the DC breath pressure.
+    // Low gain so the noise rides on top of the steady breath instead of
+    // replacing it. Attack phase gets slightly more chiff.
+    double white = ((double)xorshift32(&v->noise_seed) / (double)UINT32_MAX) * 2.0 - 1.0;
+    double onset = 1.0 - env;
+    double noise_gain = 0.08 + 0.05 * onset;
+    double breath = v->whistle_breath * (1.0 + noise_gain * white + vibrato);
 
-static inline double whistle_tick_resonator(double input, double c1, double c2, double gain,
-                                            double *y1, double *y2) {
-    double y = input * gain + c1 * (*y1) + c2 * (*y2);
-    *y2 = *y1;
-    *y1 = y;
-    return y;
-}
+    // Bore and jet delay lengths — bore = SR/freq (one wavelength),
+    // jet = 0.32 × bore (Cook's flute ratio; 0.45 for pennywhistle,
+    // 0.5 for ocarina). Clamp to the delay buffer sizes.
+    double freq = clampd(v->frequency, 110.0, sample_rate * 0.20);
+    double bore_delay = sample_rate / freq;
+    double jet_delay = bore_delay * 0.32;
+    // Cap to buffer sizes with safety margin
+    const int BORE_N = 2048;
+    const int JET_N = 512;
+    if (bore_delay > (double)(BORE_N - 2)) bore_delay = (double)(BORE_N - 2);
+    if (jet_delay > (double)(JET_N - 2)) jet_delay = (double)(JET_N - 2);
 
-static inline double generate_whistle_sample(ACVoice *v, double sample_rate) {
-    double env = compute_envelope(v);
-    double white = ((double)xorshift32(&v->noise_seed) / (double)UINT32_MAX) * 2.0 - 1.0;
-    double breath_target = 0.18 + 0.82 * sqrt(env);
-    double breath_slew = env > v->whistle_breath ? 0.009 : 0.0025;
+    // Read bore output and apply 1-pole loop LPF (models bore damping).
+    // 0.35/0.65 coefficients give ~0.65 DC gain — closes the loop just
+    // under unity so it sustains but doesn't blow up. The LPF rolls off
+    // high harmonics so the tone darkens naturally, unlike a biquad
+    // which would over-narrow the spectrum.
+    double bore_out = whistle_frac_read(v->whistle_bore_buf, BORE_N, v->whistle_bore_w, bore_delay);
+    v->whistle_lp1 = 0.35 * (-bore_out) + 0.65 * v->whistle_lp1;
+    double temp = v->whistle_lp1;
 
-    v->whistle_breath += (breath_target - v->whistle_breath) * breath_slew;
-    v->whistle_vibrato_phase += (4.6 + v->frequency * 0.0025) / sample_rate;
-    if (v->whistle_vibrato_phase >= 1.0) v->whistle_vibrato_phase -= 1.0;
+    // Jet drive: breath pressure minus jet reflection from bore feedback
+    double jet_refl = 0.5;
+    double end_refl = 0.5;
+    double pd = breath - jet_refl * temp;
 
-    whistle_update_coeffs(v, sample_rate);
+    // Write to jet delay, read back with fractional delay
+    v->whistle_jet_buf[v->whistle_jet_w] = (float)pd;
+    v->whistle_jet_w = (v->whistle_jet_w + 1) % JET_N;
+    pd = whistle_frac_read(v->whistle_jet_buf, JET_N, v->whistle_jet_w, jet_delay);
 
-    double onset = 1.0 - env;
-    // Tight feedback — more energy stays resonating between cycles so the
-    // tone is self-sustaining from the resonator, not re-injected from the
-    // noise source every sample.
-    double feedback = v->whistle_main_y1 * 2.0 - v->whistle_main_y2 * 0.95 + v->whistle_formant_y1 * 0.32;
-    // Turbulence cranked way down. Previous (0.025 base + 0.12 onset + 0.03
-    // breath) was still too much for the user. New values are roughly ⅓ of
-    // the previous iteration and ≈ 1/10 of the original. The onset chiff is
-    // now 0.03 (barely perceptible), breath modulation 0.008. Just enough
-    // noise to excite the resonator, not enough to bleed through it.
-    double turbulence = white * (0.008 + onset * 0.03 + v->whistle_breath * 0.008);
-    double jet_drive = feedback * (1.05 + v->whistle_breath * 0.5) + turbulence;
-    double jet_target = tanh(jet_drive);
+    // THE CUBIC NONLINEARITY — y = x*(x*x - 1). Negative slope at x=0
+    // creates a limit-cycle generator. This is the secret sauce that
+    // makes the tone WHISTLE instead of being filtered noise.
+    pd = pd * (pd * pd - 1.0);
+    if (pd > 1.0) pd = 1.0;
+    if (pd < -1.0) pd = -1.0;
 
-    // Faster jet slew — note speaks almost immediately, chiff is effectively
-    // gone.
-    v->whistle_jet += (jet_target - v->whistle_jet) * (0.05 + v->whistle_breath * 0.15);
+    // 1-pole DC blocker — removes the bias the cubic pumps into the loop.
+    // y[n] = x[n] - x[n-1] + 0.995*y[n-1]
+    double y = pd - v->whistle_hp_x1 + 0.995 * v->whistle_hp_y1;
+    v->whistle_hp_x1 = pd;
+    v->whistle_hp_y1 = y;
 
-    double main = whistle_tick_resonator(v->whistle_jet * (1.35 + v->whistle_breath * 0.5),
-                                         v->whistle_main_c1, v->whistle_main_c2, v->whistle_main_gain,
-                                         &v->whistle_main_y1, &v->whistle_main_y2);
-    double formant = whistle_tick_resonator(v->whistle_jet * (0.18 + v->whistle_breath * 0.1),
-                                            v->whistle_formant_c1, v->whistle_formant_c2, v->whistle_formant_gain,
-                                            &v->whistle_formant_y1, &v->whistle_formant_y2);
+    // Close the bore loop: combine the NL-filtered jet output with the
+    // end reflection from the bore delay.
+    double into_bore = y + end_refl * temp;
+    v->whistle_bore_buf[v->whistle_bore_w] = (float)into_bore;
+    v->whistle_bore_w = (v->whistle_bore_w + 1) % BORE_N;
 
-    // The output "air" layer is GONE. Previously this added unfiltered white
-    // noise directly to the final mix, which was the main hiss you could
-    // still hear over the resonator. Now all noise must pass through the
-    // resonator filter before reaching the output.
-    double s = main * 2.8 + formant * 0.38;
-    return tanh(s * 1.6);
+    // Output is a tap off the bore loop. 0.3 gain matches STK Flute.
+    return 0.3 * into_bore;
 }
 
 static inline double compute_fade(ACVoice *v) {
@@ -1277,8 +1304,18 @@
     if (type == WAVE_NOISE) {
         setup_noise_filter(v, (double)(audio->actual_rate ? audio->actual_rate : AUDIO_SAMPLE_RATE));
     } else if (type == WAVE_WHISTLE) {
-        v->whistle_coeff_freq = 0.0;
-        whistle_update_coeffs(v, (double)(audio->actual_rate ? audio->actual_rate : AUDIO_SAMPLE_RATE));
+        // Clear the waveguide state — bore + jet delay buffers and the
+        // loop filter / DC blocker. Without this, leftover state from a
+        // previous voice reuse would produce startup artifacts.
+        memset(v->whistle_bore_buf, 0, sizeof(v->whistle_bore_buf));
+        memset(v->whistle_jet_buf, 0, sizeof(v->whistle_jet_buf));
+        v->whistle_bore_w = 0;
+        v->whistle_jet_w = 0;
+        v->whistle_breath = 0.0;
+        v->whistle_vibrato_phase = 0.0;
+        v->whistle_lp1 = 0.0;
+        v->whistle_hp_x1 = 0.0;
+        v->whistle_hp_y1 = 0.0;
     }
 
     pthread_mutex_unlock(&audio->lock);
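
The two one-pole filters in the loop can be exercised in isolation. The sketch below copies their difference equations from the diff into standalone helpers (the `DcBlock` type and all function names are hypothetical) and feeds each a constant 1.0 to look at the step response.

```c
/* One-pole loop lowpass with the diff's coefficients:
   y[n] = 0.35 * x[n] + 0.65 * y[n-1]. */
static double lp1_tick(double *state, double x) {
    *state = 0.35 * x + 0.65 * (*state);
    return *state;
}

/* DC blocker with the diff's coefficient:
   y[n] = x[n] - x[n-1] + 0.995 * y[n-1]. */
typedef struct { double x1, y1; } DcBlock;

static double dc_block_tick(DcBlock *s, double x) {
    double y = x - s->x1 + 0.995 * s->y1;
    s->x1 = x;
    s->y1 = y;
    return y;
}

/* Feed a constant 1.0 for n samples; return the last lowpass output. */
static double lp1_step_response(int n) {
    double state = 0.0, y = 0.0;
    for (int i = 0; i < n; i++) y = lp1_tick(&state, 1.0);
    return y;
}

/* Feed a constant 1.0 for n samples; return the last DC-blocker output. */
static double dc_block_step_response(int n) {
    DcBlock s = {0.0, 0.0};
    double y = 0.0;
    for (int i = 0; i < n; i++) y = dc_block_tick(&s, 1.0);
    return y;
}
```

In this isolated test the lowpass settles at 1.0 for a constant input, which suggests its DC gain is 0.35 / (1 - 0.65) = 1 rather than 0.65, so the under-unity loop gain would come from the 0.5 reflection coefficients. The DC blocker passes the initial step and then decays toward zero, as expected for a filter meant to strip bias.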

+17 -9

fedac/native/src/audio.h

@@ -51,15 +51,23 @@
     double noise_b0, noise_b1, noise_b2, noise_a1, noise_a2;
     double noise_x1, noise_x2, noise_y1, noise_y2;
     uint32_t noise_seed;
-    // Breath-excited whistle / ocarina resonator state
-    double whistle_breath;
-    double whistle_jet;
-    double whistle_vibrato_phase;
-    double whistle_coeff_freq;
-    double whistle_main_y1, whistle_main_y2;
-    double whistle_formant_y1, whistle_formant_y2;
-    double whistle_main_c1, whistle_main_c2, whistle_main_gain;
-    double whistle_formant_c1, whistle_formant_c2, whistle_formant_gain;
+    // Digital waveguide flute/whistle state (Perry Cook STK Flute model)
+    // See audio.c:generate_whistle_sample for algorithm notes. The bore
+    // delay line is the primary resonator — its length sets pitch and
+    // its feedback loop generates all the harmonics. The jet delay +
+    // cubic nonlinearity drives the loop into sustained oscillation.
+    double whistle_breath;                // envelope-smoothed breath pressure
+    double whistle_vibrato_phase;         // 0..1 vibrato LFO phase
+    double whistle_lp1;                   // 1-pole loop LPF state
+    double whistle_hp_x1, whistle_hp_y1;  // 1-pole DC blocker state
+    // Bore delay line — up to ~2048 samples at 192kHz covers down to ~94 Hz.
+    // Write cursor advances by 1 each tick; reads use fractional delay
+    // indexing for smooth pitch.
+    float whistle_bore_buf[2048];
+    int whistle_bore_w;
+    // Jet delay line — shorter, models embouchure travel time (~0.32×bore).
+    float whistle_jet_buf[512];
+    int whistle_jet_w;
 } ACVoice;
 
 typedef struct {
