Rename hw_* splits to typeset_structured/mixed; drop old typeset_* splits; pin unsloth

- data/: remove typeset_train/val/test DVC tracking and replace the typeset_mixed_* DVC
  files (old single-equation and font-homogeneous splits); add typeset_structured_* DVC
  files. The new typeset_structured_* and typeset_mixed_* splits are renamed from hw_*
  (font-diverse renders with expanded body grammar)
- data/fonts/handwriting/: commit TTF files directly (33-172KB each)
- src/data.py: update split names to typeset_structured_* / typeset_mixed_*
- pyproject.toml: pin unsloth==2026.4.5 (2026.4.4 dropped gemma-4-E2B-it support)
- uv.lock: update accordingly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

+39 -40
+3 -3
data/.gitignore
···
 /mathwriting_symbols
 /mathwriting_val
 /mathwriting_test
-/typeset_train
-/typeset_val
-/typeset_test
 /typeset_mixed_test
 /typeset_mixed_train
 /typeset_mixed_val
+/typeset_structured_test
+/typeset_structured_train
+/typeset_structured_val
data/fonts/handwriting/ComicNeue-Regular.ttf

This is a binary file and will not be displayed.

data/fonts/handwriting/DancingScript-Regular.ttf

This is a binary file and will not be displayed.

data/fonts/handwriting/GochiHand-Regular.ttf

This is a binary file and will not be displayed.

data/fonts/handwriting/Handlee-Regular.ttf

This is a binary file and will not be displayed.

data/fonts/handwriting/Oswald-Regular.ttf

This is a binary file and will not be displayed.

data/fonts/handwriting/SpecialElite-Regular.ttf

This is a binary file and will not be displayed.

+3 -3
data/typeset_mixed_test.dvc
···
 outs:
-- md5: b6f14c8d1daf2e8fced78929501fc0e9.dir
-  size: 41030794
-  nfiles: 1001
+- md5: c1e3cf8782f435b3135f6918cfbb03c2.dir
+  size: 32044732
+  nfiles: 501
   hash: md5
   path: typeset_mixed_test
+3 -3
data/typeset_mixed_train.dvc
···
 outs:
-- md5: 746dfd1208dfc28aae2e4f72f0c3e223.dir
-  size: 809063349
-  nfiles: 20001
+- md5: 5cce037102c5c521ec574edfd0d562d3.dir
+  size: 618883250
+  nfiles: 10001
   hash: md5
   path: typeset_mixed_train
+2 -2
data/typeset_mixed_val.dvc
···
 outs:
-- md5: 6f7475635b383582765f24118f13bea8.dir
-  size: 21202577
+- md5: f3c01f8cc4352d02b2c15fc4ab1a0815.dir
+  size: 30922657
   nfiles: 501
   hash: md5
   path: typeset_mixed_val
+6
data/typeset_structured_test.dvc
···
+outs:
+- md5: cae0dd567ebd51e42a63553200135d33.dir
+  size: 17131859
+  nfiles: 501
+  hash: md5
+  path: typeset_structured_test
+6
data/typeset_structured_train.dvc
···
+outs:
+- md5: ac0f899cdc71e9e397b453b5a3574ed5.dir
+  size: 506504324
+  nfiles: 15001
+  hash: md5
+  path: typeset_structured_train
+6
data/typeset_structured_val.dvc
···
+outs:
+- md5: c568259a74b9d0814763fb39d0db3855.dir
+  size: 17717698
+  nfiles: 501
+  hash: md5
+  path: typeset_structured_val
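Each `.dvc` file in this commit records an `nfiles` count for its tracked directory. After pulling data, one rough sanity check is to compare that count with what is actually on disk. The sketch below is illustrative only: `parse_dvc_out`, `count_files`, and `split_matches_dvc` are hypothetical helpers that are not part of this repo, the hand-rolled parser assumes the simple single-output layout seen in these files, and it assumes `nfiles` counts every regular file under the directory. A real check would more likely rely on a YAML parser or on `dvc status`.

```python
from pathlib import Path


def parse_dvc_out(text: str) -> dict[str, str]:
    """Flatten the single `outs` entry of a simple .dvc file into a dict.

    Assumption: one output per file, laid out as `key: value` lines.
    """
    fields: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip().removeprefix("- ")
        key, sep, value = line.partition(":")
        if sep and value.strip():  # skips the bare `outs:` header line
            fields[key.strip()] = value.strip()
    return fields


def count_files(root: Path) -> int:
    """Count regular files under root, recursively."""
    return sum(1 for p in root.rglob("*") if p.is_file())


def split_matches_dvc(dvc_file: Path) -> bool:
    """True if the directory named by `path` holds exactly `nfiles` files."""
    out = parse_dvc_out(dvc_file.read_text())
    return count_files(dvc_file.parent / out["path"]) == int(out["nfiles"])


# Contents of data/typeset_structured_val.dvc as added in this commit.
EXAMPLE = """\
outs:
- md5: c568259a74b9d0814763fb39d0db3855.dir
  size: 17717698
  nfiles: 501
  hash: md5
  path: typeset_structured_val
"""

if __name__ == "__main__":
    print(parse_dvc_out(EXAMPLE))
```

Calling `split_matches_dvc(Path("data/typeset_structured_val.dvc"))` from the repo root would then report whether the checked-out directory has the expected number of files.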
-6
data/typeset_test.dvc
···
-outs:
-- md5: 449e2c7a7ec816ba076c73d1f211458d.dir
-  size: 4733810
-  nfiles: 1002
-  hash: md5
-  path: typeset_test
-6
data/typeset_train.dvc
···
-outs:
-- md5: aaba3cc2babfdb7c2c0fb63ed5a2cdf7.dir
-  size: 39946565
-  nfiles: 8002
-  hash: md5
-  path: typeset_train
-6
data/typeset_val.dvc
···
-outs:
-- md5: 9da560a185c6393738acd0fb76460e31.dir
-  size: 4586802
-  nfiles: 1002
-  hash: md5
-  path: typeset_val
+1 -1
pyproject.toml
···
 version = "0.1.0"
 requires-python = ">=3.12"
 dependencies = [
-    "unsloth[colab-new]",
+    "unsloth[colab-new]==2026.4.5",
     "trl>=0.15",
     "datasets>=2.19",
     "Pillow>=10",
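Because the pin exists to avoid a release that broke model support, it can help to confirm that the active environment actually has the pinned version. The following is a minimal sketch under stated assumptions: `parse_exact_pin` and `pin_status` are hypothetical helpers (not part of this repo), and the regex only handles plain `name[extras]==version` strings rather than the full dependency-specifier grammar.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Matches e.g. "unsloth[colab-new]==2026.4.5"; extras are optional and ignored.
PIN_RE = re.compile(r"^(?P<name>[A-Za-z0-9._-]+)(?:\[[^\]]*\])?==(?P<version>\S+)$")


def parse_exact_pin(requirement: str) -> tuple[str, str]:
    """Split an exact '==' requirement into (package name, pinned version)."""
    m = PIN_RE.match(requirement.strip())
    if m is None:
        raise ValueError(f"not an exact '==' pin: {requirement!r}")
    return m["name"], m["version"]


def pin_status(requirement: str) -> str:
    """Describe whether the installed distribution matches the pin."""
    name, wanted = parse_exact_pin(requirement)
    try:
        installed = version(name)
    except PackageNotFoundError:
        return f"{name}: not installed (want {wanted})"
    if installed == wanted:
        return f"{name}: ok ({installed})"
    return f"{name}: installed {installed}, want {wanted}"


if __name__ == "__main__":
    print(pin_status("unsloth[colab-new]==2026.4.5"))
```

The comparison is a plain string match, which is enough for an exact pin; range specifiers such as `trl>=0.15` are deliberately rejected by `parse_exact_pin`.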
+3 -3
src/data.py
···
     "mathwriting_train",
     "mathwriting_synthetic",
     "mathwriting_symbols",
-    "hw_structured_train", "hw_mixed_train",
+    "typeset_structured_train", "typeset_mixed_train",
 ]
 VAL_SPLITS = [
     "mathwriting_val",
-    "hw_structured_val", "hw_mixed_val",
+    "typeset_structured_val", "typeset_mixed_val",
 ]
 TEST_SPLITS = [
     "mathwriting_test",
-    "hw_structured_test", "hw_mixed_test",
+    "typeset_structured_test", "typeset_mixed_test",
 ]

 # Splits whose manifest typst field is a bare math expression (no $ delimiters).
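A rename like this is easy to leave half-finished: a split name in `src/data.py` can drift from the `.dvc` files under `data/`. The sketch below shows one way to catch that. It is an illustration, not code from this repo: `missing_dvc_files` and `stale_names` are hypothetical helpers, and it assumes every split is tracked as `data/<name>.dvc`, which the diff confirms only for the `typeset_*` splits.

```python
from pathlib import Path

# Split names as they appear in src/data.py after this commit.
VAL_SPLITS = ["mathwriting_val", "typeset_structured_val", "typeset_mixed_val"]
TEST_SPLITS = ["mathwriting_test", "typeset_structured_test", "typeset_mixed_test"]


def missing_dvc_files(splits: list[str], data_dir: Path) -> list[str]:
    """Return split names that have no matching <name>.dvc file in data_dir."""
    return [name for name in splits if not (data_dir / f"{name}.dvc").is_file()]


def stale_names(splits: list[str], prefix: str = "hw_") -> list[str]:
    """Return split names that still carry the old prefix."""
    return [name for name in splits if name.startswith(prefix)]


if __name__ == "__main__":
    all_splits = VAL_SPLITS + TEST_SPLITS
    print("missing .dvc files:", missing_dvc_files(all_splits, Path("data")))
    print("stale hw_* names:", stale_names(all_splits))
```

Run from the repo root, an empty list for both checks would suggest the split names and the DVC tracking files agree.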
+6 -7
uv.lock
···
     { name = "torchvision", specifier = ">=0.18" },
     { name = "tqdm" },
     { name = "trl", specifier = ">=0.15" },
-    { name = "unsloth", extras = ["colab-new"] },
+    { name = "unsloth", extras = ["colab-new"], specifier = "==2026.4.5" },
     { name = "uvicorn", extras = ["standard"], specifier = ">=0.29" },
 ]

···

 [[package]]
 name = "unsloth"
-version = "2026.4.4"
+version = "2026.4.5"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
     { name = "accelerate" },
···
     { name = "wheel" },
     { name = "xformers", marker = "(platform_machine == 'AMD64' and 'linux' in sys_platform) or (platform_machine == 'x86_64' and 'linux' in sys_platform) or (platform_machine == 'AMD64' and sys_platform == 'win32') or (platform_machine == 'x86_64' and sys_platform == 'win32')" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/12/54/52822f5ecec70d8ce4733164df302eee55b4aba1cb3860e47c58d809dcd9/unsloth-2026.4.4.tar.gz", hash = "sha256:5d5c0c1d5bd48886927e34c2d9b59ee610c882e917ba1362fe6209a7c3eea97d", size = 66783124, upload-time = "2026-04-06T16:40:57.746Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/a8/80/f5246519a22f9b962f8d6364e980c83ca666b03aba693ef4223c70597000/unsloth-2026.4.4-py3-none-any.whl", hash = "sha256:da9d80a6f1a50b53ee641fbff254716945921c6ffc05d3bdbb676865a04cd27f", size = 62599611, upload-time = "2026-04-06T16:40:53.546Z" },
+    { url = "https://files.pythonhosted.org/packages/d5/fc/c239aedf742c9509511fc39cecac76166ecc6bb6cca55d39273e52040504/unsloth-2026.4.5-py3-none-any.whl", hash = "sha256:428fa76a1be69888e87a7936a2d5d705977094e06c5b46db061dd6018d0af620", size = 62991720, upload-time = "2026-04-15T15:19:26.03Z" },
 ]

 [package.optional-dependencies]
···

 [[package]]
 name = "unsloth-zoo"
-version = "2026.4.6"
+version = "2026.4.7"
 source = { registry = "https://pypi.org/simple" }
 dependencies = [
     { name = "accelerate" },
···
     { name = "tyro" },
     { name = "wheel" },
 ]
-sdist = { url = "https://files.pythonhosted.org/packages/a4/ea/a0b38fc3977526513905b1135f4dcde994b2f5a5c712ed83121c8d4af1a1/unsloth_zoo-2026.4.6.tar.gz", hash = "sha256:78083d47774ef7efee8e9cb3e211a6a70c2746b3080c6cd1f0b2ba1c08199a0e", size = 385361, upload-time = "2026-04-09T14:45:59.712Z" }
+sdist = { url = "https://files.pythonhosted.org/packages/87/a0/e009fae4a0cc3e832d159ff4df2019dd519c5de60d90871ffb458247d178/unsloth_zoo-2026.4.7.tar.gz", hash = "sha256:8c9e7c776fb994400b11a01210b66d42463663135668badf2ef3825069f0ae1e", size = 386455, upload-time = "2026-04-15T15:02:14.538Z" }
 wheels = [
-    { url = "https://files.pythonhosted.org/packages/51/23/bb59f2c00e25dbfefe65d636429658769c8e5fac8069dcaccd22a03238ce/unsloth_zoo-2026.4.6-py3-none-any.whl", hash = "sha256:326651efcb60d6124f702dd07cde3bfb85ad196ee6293a0deda9b76c4b3ee4ff", size = 418416, upload-time = "2026-04-09T14:45:58.387Z" },
+    { url = "https://files.pythonhosted.org/packages/fb/98/4bb6974752304d6253cd52532a432362402b0a272abe6f2baf1d913afec6/unsloth_zoo-2026.4.7-py3-none-any.whl", hash = "sha256:32b8393c71d96b2b5edd27adb977ab3067604fc8fb68d33db6b7adbac3712b9f", size = 419610, upload-time = "2026-04-15T15:02:13.079Z" },
 ]

 [[package]]