GGUF support
#17
by
geboh67859
- opened
GGUF format support will make your great work accessible to more users!
The mainline llama.cpp PR is here: https://github.com/ggml-org/llama.cpp/pull/16831
I got @DevQuasar's Q8_0 quant working with the above PR; the command to run it is posted in the GGUF repo's discussions: https://huggingface.co/DevQuasar/MiniMaxAI.MiniMax-M2-GGUF/discussions/1
It seems to be working okay, though be mindful of the model's unique interleaved thinking tags in the chat thread.