Switch to Python 3.12; 1 epoch; rebalance training caps
- requires-python >= 3.12, .python-version pinned to 3.12
- Default epochs 3 -> 1 (~59h on RTX 3060 at current step time)
- Cap mathwriting_train at 10k (was 42k; real but over-represented)
- Cap mathwriting_synthetic at 20k, crohme_gen_2019 at 15k (unchanged)
- Total training samples: ~91k, ~11.4k optimizer steps per epoch
- Keep eager attention (MLA absorption trick is already efficient;
flash-attn source compile is too heavy for available hardware)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>