Grok-2.5
Large-scale xAI model for local inference with SGLang
...The model is distributed as raw weights rather than hosted by inference providers, so running it requires specialized infrastructure. Users must download over 500 GB of files and serve them locally with the SGLang inference engine. Grok-2.5 runs on multi-GPU configurations and requires at least 8 GPUs, each with more than 40 GB of memory. Through SGLang, the model can be served, tested, and used for chat-style interactions. Grok-2.5 is a post-trained model, so it requires the correct chat template to function properly. ...
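The hardware and storage figures above can be turned into a quick preflight check before committing to the download. The following is a minimal sketch, not an official tool: `check_host`, `free_disk_gb`, and `launch_command` are hypothetical helper names, the thresholds simply restate the numbers in the description, and the launch command assumes SGLang's `sglang.launch_server` entry point with its `--model-path` and `--tp` options. The actual model card may call for additional flags (for example a tokenizer path or quantization setting), so treat the command as a starting point.

```python
import shutil
from dataclasses import dataclass, field

# Thresholds restated from the description; rough guides, not official limits.
MIN_GPUS = 8          # at least 8 GPUs
MIN_GPU_MEM_GB = 40   # each GPU needs MORE than this much memory
MIN_DISK_GB = 500     # the download is described as over 500 GB


@dataclass
class Preflight:
    ok: bool
    problems: list = field(default_factory=list)


def check_host(gpu_mem_gb, free_gb):
    """Check a host against the stated requirements.

    gpu_mem_gb: per-GPU memory sizes in GB, one entry per GPU.
    free_gb: free disk space in GB where the weights will be stored.
    """
    problems = []
    usable = [m for m in gpu_mem_gb if m > MIN_GPU_MEM_GB]
    if len(usable) < MIN_GPUS:
        problems.append(
            f"need {MIN_GPUS} GPUs with > {MIN_GPU_MEM_GB} GB each, "
            f"found {len(usable)}"
        )
    if free_gb <= MIN_DISK_GB:
        problems.append(
            f"need more than {MIN_DISK_GB} GB free disk, found {free_gb:.0f} GB"
        )
    return Preflight(ok=not problems, problems=problems)


def free_disk_gb(path="."):
    """Free space at `path` in decimal gigabytes (standard library only)."""
    return shutil.disk_usage(path).free / 1e9


def launch_command(model_dir, tp=MIN_GPUS):
    """Build an argv list for an SGLang server (assumed flags, see lead-in)."""
    return [
        "python3", "-m", "sglang.launch_server",
        "--model-path", model_dir,  # local directory holding the weights
        "--tp", str(tp),            # tensor-parallel degree, one shard per GPU
    ]
```

A typical use would be to pass the per-GPU memory sizes reported by the driver together with `free_disk_gb("/path/to/weights")` into `check_host`, and only start the download and server if `ok` is true. Keeping the check as a pure function of its inputs makes it easy to test without GPUs present.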