<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>llamafu — cognisoc</title><description>llamafu is a Flutter FFI plugin built on llama.cpp for running GGUF large language models on Android and iOS devices, fully on-device.</description><link>https://llamafu.cognisoc.com/</link><item><title>Structured output on-device with grammars and schemas</title><link>https://llamafu.cognisoc.com/blog/structured-output-on-device/</link><guid isPermaLink="true">https://llamafu.cognisoc.com/blog/structured-output-on-device/</guid><description>How llamafu uses GBNF grammars and JSON schemas to get reliable structured output from quantized models running on a phone.</description><pubDate>Wed, 11 Mar 2026 00:00:00 GMT</pubDate></item><item><title>Picking a GGUF quantization for mobile</title><link>https://llamafu.cognisoc.com/blog/picking-gguf-quantization/</link><guid isPermaLink="true">https://llamafu.cognisoc.com/blog/picking-gguf-quantization/</guid><description>Q4_K_M, Q4_0, Q8_0 — what they mean, when to pick which, and how llamafu lets you load any of them.</description><pubDate>Wed, 04 Feb 2026 00:00:00 GMT</pubDate></item><item><title>Why on-device LLMs are no longer a science project</title><link>https://llamafu.cognisoc.com/blog/why-on-device-llms/</link><guid isPermaLink="true">https://llamafu.cognisoc.com/blog/why-on-device-llms/</guid><description>Four practical reasons to move LLM inference onto the phone, and what llamafu gives you to do it on Flutter.</description><pubDate>Wed, 14 Jan 2026 00:00:00 GMT</pubDate></item></channel></rss>