llamafu — cognisoc

llamafu is a Flutter FFI plugin built on llama.cpp for running GGUF large language models on Android and iOS devices, fully on-device.
https://llamafu.cognisoc.com/

Structured output on-device with grammars and schemas
https://llamafu.cognisoc.com/blog/structured-output-on-device/
How llamafu uses GBNF grammars and JSON schemas to get reliable structured output from quantized models running on a phone.
Wed, 11 Mar 2026 00:00:00 GMT

Picking a GGUF quantization for mobile
https://llamafu.cognisoc.com/blog/picking-gguf-quantization/
Q4_K_M, Q4_0, Q8_0: what they mean, when to pick which, and how llamafu lets you load any of them.
Wed, 04 Feb 2026 00:00:00 GMT

Why on-device LLMs are no longer a science project
https://llamafu.cognisoc.com/blog/why-on-device-llms/
Four practical reasons to move LLM inference onto the phone, and what llamafu gives you to do it on Flutter.
Wed, 14 Jan 2026 00:00:00 GMT