firefox + llama.cpp == very good prose.

chore(readme) quickstart llaamaa

eagleusb e5977765 e7116590

+24
README.md
@@ -4,6 +4,30 @@
 
 ![./assets/demo.png](./assets/demo.png)
 
+## quickstart
+
+```bash
+env | sort -u | grep -iP '^llama.*'
+LLAMA_ARG_CPU_MOE=true
+LLAMA_ARG_CTX_CHECKPOINTS=3
+LLAMA_ARG_DIO=true
+LLAMA_ARG_KV_UNIFIED=true
+LLAMA_ARG_PERF=false
+LLAMA_ARG_SWA_FULL=true
+LLAMA_LOG_FILE=/tmp/llamacpp.log
+LLAMA_LOG_VERBOSITY=3
+
+llama-server -hf unsloth/gemma-4-E2B-it-GGUF:Q4_K_S \
+-ngl 99 \
+--ubatch-size 512 --batch-size 2048 \
+--ctx-size 2048 \
+--cache-ram 0 \
+--reasoning-budget 1024 \
+--threads 8 \
+--fit off \
+--device CUDA0
+```
+
 ## disclaimer
 
 - coded with an LLM, adjusted, refactored, verified by hand.
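
The quickstart shows the `LLAMA_*` variables as the output of `env`, not how they were set. A minimal sketch, assuming they are exported in the same shell that later launches `llama-server` (the README does not say where the author defines them, so the use of `export` here is an assumption; the names and values are copied from the listing in the diff):

```bash
# Assumption: values mirror the env listing in the README diff; adjust to your setup.
export LLAMA_ARG_CPU_MOE=true
export LLAMA_ARG_CTX_CHECKPOINTS=3
export LLAMA_ARG_DIO=true
export LLAMA_ARG_KV_UNIFIED=true
export LLAMA_ARG_PERF=false
export LLAMA_ARG_SWA_FULL=true
export LLAMA_LOG_FILE=/tmp/llamacpp.log
export LLAMA_LOG_VERBOSITY=3

# Confirm what a child process would inherit.
# Plain `grep -i` is used here because `grep -P` is not available in every grep build.
env | sort -u | grep -i '^llama'
```

Exported variables are inherited by any process started from that shell, so the `llama-server` command from the quickstart can follow directly afterwards.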