GGUF
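A binary file format for storing models, typically quantized, used by llama.cpp and other tools for running LLMs locally. A single file holds the model's weights together with its metadata, and the weights can be stored at various quantization levels (for example Q4, Q5, and Q8, which use roughly 4, 5, and 8 bits per weight).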
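As a rough illustration of what "file format" means here, the sketch below reads the fixed-size header at the start of a GGUF file. It assumes GGUF version 2 or later in little-endian layout: a 4-byte magic `GGUF`, a 32-bit version, then 64-bit tensor and metadata key-value counts. The function name `read_gguf_header` and the sample values are hypothetical, chosen for this example; it is not llama.cpp's own loader.

```python
import io
import struct
from typing import BinaryIO

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def read_gguf_header(stream: BinaryIO) -> dict:
    """Read the fixed GGUF header fields (hypothetical helper).

    Assumes GGUF version >= 2, little-endian: uint32 version,
    uint64 tensor count, uint64 metadata key-value count.
    """
    magic = stream.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic is {magic!r}")

    raw = stream.read(20)  # 4 + 8 + 8 bytes
    if len(raw) != 20:
        raise ValueError("truncated GGUF header")

    version, tensor_count, kv_count = struct.unpack("<IQQ", raw)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


# Demo on a synthetic in-memory header (made-up values, not a real model).
sample = GGUF_MAGIC + struct.pack("<IQQ", 3, 291, 24)
print(read_gguf_header(io.BytesIO(sample)))
```

To inspect a real file, the same function could be called on a stream opened with `open(path, "rb")`. The metadata entries and tensor descriptions that follow the header are not parsed here.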