revert host retention changes (143605c..e5a2a14)

reverts all four host retention commits that caused production
restart loops. the interaction between reconciliation, dormant
logic, startup jitter, and cold-start ramp at ~2,800 hosts
could not be exercised by unit tests, and each deploy regressed
relay-eval coverage (alternating between 0% and 97% from kubelet kills).

restores 80eca78 behavior: exhausted hosts stop after 15 failures,
cron handles re-discovery. stable baseline for 24+ hours at 97-99%.

the feature needs a local test harness that validates startup ramp
behavior against a realistic host table before any production deploy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>