Mixtral-8x7B-Instruct-v0.1 download

Mixtral-8x7B-Instruct-v0.1 is an instruction-tuned large language model developed by Mistral AI. It is built on a Sparse Mixture of Experts (MoE) architecture in which a router selects 2 of 8 expert feed-forward networks for each token at every layer. The model has 46.7 billion parameters in total but uses only about 12.9 billion of them per token, so it offers the capacity of a much larger model at a lower compute cost per token. It is fine-tuned for multi-turn conversations and expects a strict instruction format in which each user message is wrapped in [INST] and [/INST] tags. It outperforms Llama 2 70B on several benchmarks. The model is available through Hugging Face Transformers and can be run with Flash Attention 2 for faster attention and with bitsandbytes for low-precision (quantized) inference. It produces coherent, contextually appropriate responses in five languages (English, French, German, Spanish, and Italian) and is suitable for chat-based tasks in both research and production environments. It has no built-in moderation or alignment safeguards, so safe deployment requires external guardrails.
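
The instruction format described above can be illustrated with a short sketch. The helper name build_prompt and the sample turns are hypothetical, and the exact whitespace and special-token handling are assumptions; in practice the tokenizer's own chat template (tokenizer.apply_chat_template in Hugging Face Transformers) is the safer reference.

```python
from typing import List, Optional, Tuple

# Special tokens written as plain strings for illustration only (assumption:
# a real tokenizer may add the beginning-of-sequence token by itself).
BOS, EOS = "<s>", "</s>"


def build_prompt(turns: List[Tuple[str, Optional[str]]]) -> str:
    """Format (user, assistant) turns with [INST] ... [/INST] tags.

    Only the final turn may have assistant=None, meaning the model
    is expected to produce the next reply.
    """
    if not turns:
        raise ValueError("at least one turn is required")
    parts = [BOS]
    for i, (user, assistant) in enumerate(turns):
        parts.append(f"[INST] {user.strip()} [/INST]")
        if assistant is not None:
            # A completed assistant reply is closed with the end-of-sequence token.
            parts.append(f" {assistant.strip()}{EOS}")
        elif i != len(turns) - 1:
            raise ValueError("only the final turn may lack an assistant reply")
    return "".join(parts)


prompt = build_prompt([
    ("What is a mixture of experts?",
     "A model that routes each token to a few expert networks."),
    ("How many experts does Mixtral use per token?", None),
])
print(prompt)
```

The resulting string ends with an open [/INST] tag, which is where the model's next answer would begin.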

Features

Sparse Mixture of Experts with 2-of-8 active experts
46.7B total parameters with 12.9B active per token
Outperforms Llama 2 70B on many benchmarks
Instruction-tuned for coherent multi-turn dialogue
Efficient inference with Flash Attention 2 and bitsandbytes
Supports Hugging Face Transformers and vLLM integration
Openly licensed under Apache 2.0
Generates conversational output in five supported languages (English, French, German, Spanish, Italian)
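
The relationship between 2-of-8 routing and the 46.7B total versus 12.9B active parameter counts can be made concrete with a small sketch. The router scores below are hypothetical, and the layer sizes are assumptions taken from the model's published configuration (hidden size 4096, expert feed-forward size 14336, 32 layers, vocabulary of 32000, 8 key/value heads of size 128). This is a rough tally, not the model's actual implementation.

```python
import math
from typing import List, Tuple


def top2_route(router_logits: List[float]) -> List[Tuple[int, float]]:
    """Pick the 2 highest-scoring experts and softmax-normalize their weights."""
    top2 = sorted(range(len(router_logits)),
                  key=lambda i: router_logits[i], reverse=True)[:2]
    peak = max(router_logits[i] for i in top2)  # subtract max for stability
    exps = [math.exp(router_logits[i] - peak) for i in top2]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top2, exps)]


# Hypothetical router scores for one token over 8 experts.
logits = [0.1, 2.0, -1.0, 0.5, 1.2, -0.3, 0.0, 0.7]
routes = top2_route(logits)
print([expert for expert, _ in routes])  # -> [1, 4]

# Assumed configuration values (see lead-in).
HIDDEN, FFN, LAYERS, EXPERTS, ACTIVE = 4096, 14336, 32, 8, 2
VOCAB, KV_DIM = 32000, 1024  # 8 key/value heads of size 128

expert = 3 * HIDDEN * FFN                              # gate, up, down projections
attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM  # q and o, plus k and v
router = HIDDEN * EXPERTS                              # one score per expert
norms = 2 * HIDDEN                                     # two norm layers per block
# Parameters every token passes through: attention, routers, norms,
# input embeddings, output head, and the final norm.
shared = LAYERS * (attention + router + norms) + 2 * VOCAB * HIDDEN + HIDDEN

total_params = shared + LAYERS * EXPERTS * expert
active_params = shared + LAYERS * ACTIVE * expert
print(f"total:  {total_params / 1e9:.1f}B")   # -> total:  46.7B
print(f"active: {active_params / 1e9:.1f}B")  # -> active: 12.9B
```

Because attention and embedding weights are shared by all experts, the total is well below 8 x 7B, and the active count is more than a quarter of the total even though only 2 of 8 experts run per token.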

Additional Project Details

Registered: 2025-06-27

Similar Business Software

Kimi K2

Kimi K2 is a state-of-the-art open source large language model series built on a mixture-of-experts (MoE) architecture, featuring 1 trillion total parameters and 32 billion activated parameters for task-specific efficiency. Trained with the Muon optimizer on over 15.5 trillion tokens and...

DeepSeek-V2

DeepSeek-V2 is a state-of-the-art Mixture-of-Experts (MoE) language model introduced by DeepSeek-AI, characterized by its economical training and efficient inference capabilities. With a total of 236 billion parameters, of which only 21 billion are active per token, it supports a context length...

Qwen2

Qwen2 is a series of large language models developed by the Qwen team at Alibaba Cloud. It includes both base language models and instruction-tuned models, ranging from 0.5 billion to 72 billion parameters, and...
