- Copied CROHME+MathWriting raster splits from eff-mer with clean names
- Renamed data/typeset -> data/typeset_train for consistency
- Unified RASTER_ROOT+TYPESET_ROOT into single DATA_ROOT in src/data.py
- typeset_train now in TRAIN_SPLITS directly; train.py simplified accordingly
- DVC initialized; all 14 splits tracked as .dvc pointer files
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
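To illustrate the DATA_ROOT unification described in this commit, here is a minimal sketch of how src/data.py might resolve every split directory from one root. Only `DATA_ROOT`, `TRAIN_SPLITS`, and `typeset_train` come from the commit message. The root path `data`, the helper `split_dir`, and the `raster_example_train` entry are assumptions made for illustration, not the repository's actual code.

```python
from pathlib import Path

# Single root for all splits (assumed value; the commit only names the constant).
DATA_ROOT = Path("data")

# typeset_train is listed directly alongside the raster splits.
# "raster_example_train" is a hypothetical placeholder, not a real split name.
TRAIN_SPLITS = ["raster_example_train", "typeset_train"]


def split_dir(name: str) -> Path:
    """Return the directory of a split under the shared DATA_ROOT (hypothetical helper)."""
    return DATA_ROOT / name


# Every training split resolves the same way, so the training script
# would not need a special case for typeset data.
train_dirs = [split_dir(name) for name in TRAIN_SPLITS]
```

With one root, moving the dataset only requires changing `DATA_ROOT`.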
PEP 723 inline-dep scripts for isolated transformers==4.46.3 env:
- scripts/probe_deepseek.py: inference probe
- scripts/train_deepseek.py: QLoRA fine-tuning on Typst OCR data
Also adds addict, matplotlib, einops, easydict to pyproject deps
and a (non-functional) probe-deepseek entry point stub.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>