SuperGemma4 is a locally deployable large language model built on the Gemma 4 26B A4B instruction base, optimized for speed, flexibility, and less restricted conversational behavior. It is designed to provide a more open and natural chat experience compared to standard censored models, while still maintaining practical usability across general text, coding, and multilingual tasks, especially Korean. Unlike raw base models, it inherits improvements from the SuperGemma Fast line, resulting in better performance in logic, coding, and real-world text workflows. The model is packaged in GGUF format for efficient use with llama.cpp and has been specifically tested on Apple Silicon hardware, delivering high token speeds and smooth local inference. A neutral chat template is embedded to prevent prompt misrouting issues, ensuring consistent responses without unintended shifts into coding or tool-use modes.
Features
- Uncensored conversational behavior for more flexible outputs
- Optimized GGUF format for fast local inference
- High token generation speed on Apple Silicon devices
- Neutral chat template to prevent prompt misrouting
- Improved performance over base Gemma in coding and logic
- Supports multilingual tasks including Korean
- Balanced general chat and coding capabilities
- Compatible with llama.cpp for local deployment