GGUF

Infrastructure

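A binary file format that stores a model's weights and metadata together in a single file. It is used by llama.cpp and other tools for running LLMs locally, and is most often used to distribute quantized models. It supports a range of quantization levels, for example 4-bit, 5-bit, and 8-bit variants (commonly labeled Q4, Q5, and Q8); lower bit widths give smaller files and lower memory use at some cost in output quality.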
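To make the "file format" part concrete, here is a minimal Python sketch that reads the fixed-size header at the start of a GGUF file. It assumes the header layout used by GGUF version 2 and later (4-byte magic `GGUF`, then a little-endian uint32 version, a uint64 tensor count, and a uint64 metadata key/value count). The function name `read_gguf_header` and the returned dictionary keys are illustrative, not part of any library.

```python
import io
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_header(stream):
    """Parse the fixed-size GGUF header from a binary stream.

    Assumes the version 2+ layout: uint32 version, uint64 tensor count,
    uint64 metadata key/value count, all little-endian.
    """
    magic = stream.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={magic!r}")

    raw = stream.read(20)  # 4 + 8 + 8 bytes
    if len(raw) != 20:
        raise ValueError("truncated GGUF header")

    version, tensor_count, kv_count = struct.unpack("<IQQ", raw)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


if __name__ == "__main__":
    # Synthetic header built in memory; the counts are made-up values.
    fake = b"GGUF" + struct.pack("<IQQ", 3, 291, 24)
    print(read_gguf_header(io.BytesIO(fake)))
```

For a real file, open it in binary mode and pass the file object, e.g. `read_gguf_header(open(path, "rb"))`, where `path` points at a local `.gguf` file. The quantization type of each tensor is recorded later in the file, after the metadata section, so this sketch only confirms that a file is GGUF and reports its version and counts.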

Related terms

Quantization, llama.cpp