narrow persist_order to cover only dp.persist()

move resequenceFrame, the heap dupe, and broadcast_queue.push() outside
the ordering lock. persist_order now covers only the DB persist call
and the seq store, the minimum needed for monotonic sequence assignment.

this eliminates the cascade where producers spin on persist_order
while another producer is blocked in broadcast_queue.push() on a full
queue.

slight reordering in the ring is acceptable: seq is embedded in the
frame data, and consumers/history track by seq.
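
a minimal sketch of the narrowed critical section, in Python for
brevity. Publisher, publish, last_seq, db, and ring_capacity are
hypothetical names; threading.Lock stands in for the spin lock and
queue.Queue for the broadcast ring. this illustrates the locking shape
only and is not the actual implementation.

```python
import queue
import threading


class Publisher:
    """Hypothetical publisher; names mirror the commit, bodies are stand-ins."""

    def __init__(self, ring_capacity):
        self.persist_order = threading.Lock()  # stand-in for the spin lock
        self.last_seq = 0                      # the "seq store"
        self.db = []                           # stand-in for the DB behind dp.persist()
        self.broadcast_queue = queue.Queue(maxsize=ring_capacity)

    def publish(self, payload):
        # Critical section: only the persist call and the seq store,
        # so seq values are assigned in persist order.
        with self.persist_order:
            seq = self.last_seq + 1
            self.db.append((seq, payload))     # stand-in for dp.persist()
            self.last_seq = seq

        # Everything below runs without the ordering lock.
        # Stand-in for resequenceFrame and the heap dupe: seq travels
        # inside the frame, so consumers can order by it.
        frame = {"seq": seq, "data": bytes(payload)}

        # May block when the queue is full, but no lock is held here,
        # so other producers can still persist and take a seq.
        self.broadcast_queue.put(frame)
        return seq
```

frames may reach the queue in a different order than their seq values,
which is why each frame carries its seq.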

metrics showed persist_order_spins_total dominating at ~1,100 hosts
(548M spins) while push_lock_spins was zero, confirming that the width
of the critical section was the bottleneck.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>