Increase batch size to 2, halve grad_accum; update seed to 29979

batch=2, grad_accum=4 keeps the effective batch size at 8 (was batch=1,
grad_accum=8) while improving GPU utilization.
The ~5.2 GB of VRAM headroom at batch=1 should accommodate the extra
activations.
Seed 42 -> 29979 for train/val sampling.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>