...It combines a 13.5B-parameter language model with a 0.4B-parameter vision encoder, enabling strong multimodal understanding across text and image tasks. This FP8 instruct-tuned variant is designed for chat, instruction following, and agentic workflows, with robust system-prompt adherence. Despite its roughly 14B combined parameters, the model is engineered for practical deployment: it can run locally on a single 24GB GPU when served in FP8, and in less memory with further quantization. Its multilingual support spans dozens of major languages, making it suitable for multilingual and localized AI applications. ...
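
The single-GPU claim above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is a rough estimate, not a measurement. It takes the parameter counts from the description and assumes one byte per parameter for FP8, the same precision for the vision encoder, and a decimal 24 GB budget. The `bf16` and `int4` rows are hypothetical comparison points, and the helper name `weight_memory_gb` is illustrative. It counts weights only, so KV cache, activations, and runtime overhead must fit in the remaining headroom.

```python
# Rough weight-only memory estimate. Parameter counts come from the
# description above; everything else is an assumption for illustration.
LLM_PARAMS = 13.5e9      # language model parameters
VISION_PARAMS = 0.4e9    # vision encoder parameters
GPU_MEMORY_GB = 24       # assumed decimal GB (1 GB = 1e9 bytes)

# Bytes per parameter. FP8 is 1 byte; bf16 and int4 are hypothetical
# comparison points (unquantized baseline and further quantization).
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Return the memory needed for the weights alone, in decimal GB."""
    return total_params * bytes_per_param / 1e9


total = LLM_PARAMS + VISION_PARAMS

for name, nbytes in BYTES_PER_PARAM.items():
    gb = weight_memory_gb(total, nbytes)
    headroom = GPU_MEMORY_GB - gb  # negative means the weights do not fit
    print(f"{name}: {gb:.1f} GB weights, {headroom:.1f} GB headroom")
```

Under these assumptions the FP8 weights take about 13.9 GB, leaving roughly 10 GB on a 24 GB card for the KV cache and runtime overhead, while 16-bit weights alone would exceed the card. How much context length that headroom supports depends on the serving stack and is not estimated here.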