...As the instruct-tuned FP8 variant, it is optimized for reliable instruction following, agentic workflows, production-grade assistants, and long-context enterprise tasks. It combines a 673B-parameter mixture-of-experts (MoE) language backbone with a 2.5B-parameter vision encoder, enabling multimodal understanding across text and images. The model supports dozens of languages and maintains strong system-prompt adherence, making it suitable for multilingual and structured enterprise use. It runs in FP8 on a single node of B200 or H200 GPUs, and can also operate in NVFP4 mode on H100 or A100 hardware. With a 256k-token context window, it is suited to long-document comprehension, deep retrieval workflows, and complex knowledge-intensive tasks.
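
The hardware pairing above can be sanity-checked with a rough weight-memory estimate. The sketch below is a back-of-the-envelope calculation, not a deployment guide. Only the parameter counts come from the description. The bytes-per-parameter values, the per-GPU memory figures (nominal capacities, with the 80 GB A100 variant assumed), and the 8-GPU node size are assumptions, and the helper names `weight_gb` and `headroom_gb` are hypothetical. It counts weights only and ignores KV cache, activations, and quantization scale factors.

```python
# Back-of-the-envelope weight-memory estimate (weights only).
LM_PARAMS = 673e9       # language MoE backbone, from the description
VISION_PARAMS = 2.5e9   # vision encoder, from the description

# ASSUMPTION: idealized bytes per parameter; real checkpoints also store
# scale factors and may keep some layers in higher precision.
BYTES_PER_PARAM = {"FP8": 1.0, "NVFP4": 0.5}

# ASSUMPTION: nominal per-GPU memory in GB (A100 taken as the 80 GB variant).
GPU_MEM_GB = {"B200": 192, "H200": 141, "H100": 80, "A100": 80}

# ASSUMPTION: a "single node" means 8 GPUs.
GPUS_PER_NODE = 8


def weight_gb(fmt: str) -> float:
    """Approximate size of all weights in GB for the given number format."""
    return (LM_PARAMS + VISION_PARAMS) * BYTES_PER_PARAM[fmt] / 1e9


def headroom_gb(fmt: str, gpu: str) -> float:
    """Node memory left after loading weights; negative means they do not fit."""
    return GPU_MEM_GB[gpu] * GPUS_PER_NODE - weight_gb(fmt)


if __name__ == "__main__":
    for fmt in BYTES_PER_PARAM:
        for gpu in GPU_MEM_GB:
            left = headroom_gb(fmt, gpu)
            status = "fits" if left > 0 else "does not fit"
            print(f"{fmt:>5} on 8x {gpu}: weights {weight_gb(fmt):.1f} GB, "
                  f"headroom {left:.1f} GB ({status})")
```

Under these assumptions the FP8 weights come to about 675 GB, which exceeds the 640 GB of an 8x 80 GB node but fits within 8x H200 or 8x B200. Halving the footprint with NVFP4 brings the weights to roughly 338 GB, which is consistent with the description pairing NVFP4 with H100 or A100 hardware. Actual requirements will be higher once the KV cache for long contexts is included.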