GPT-OSS-120B is a powerful open-weight language model from OpenAI, optimized for reasoning, tool use, and agentic tasks. With 117B total parameters and 5.1B active parameters per token, it is designed to fit on a single H100 GPU using native MXFP4 quantization. The model supports fine-tuning, chain-of-thought reasoning, and structured outputs, making it well suited to complex workflows. It operates in OpenAI’s Harmony response format and can be deployed via Transformers, vLLM, Ollama, LM Studio, and PyTorch. Developers can set the reasoning level (low, medium, or high) to trade response speed against reasoning depth. Released under the Apache 2.0 license, it can be used in both commercial and research applications. The model supports function calling, web browsing, and code execution, streamlining intelligent agent development.
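A minimal Transformers sketch of local inference. It assumes the Hugging Face checkpoint id `openai/gpt-oss-120b`, a recent Transformers release with MXFP4 support, and the Harmony convention of setting reasoning depth through a `Reasoning: high` hint in the system message; adjust these to your setup.

```python
from transformers import pipeline

# Assumption: the model is published on the Hugging Face Hub under this id.
model_id = "openai/gpt-oss-120b"

pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",   # let Transformers pick the quantized/native dtype
    device_map="auto",    # spread weights across available GPU memory
)

messages = [
    # Assumption: reasoning level is set via a Harmony-style system hint.
    {"role": "system", "content": "Reasoning: high"},
    {"role": "user", "content": "Explain MXFP4 quantization in two sentences."},
]

outputs = pipe(messages, max_new_tokens=256)
# The pipeline returns the conversation with the assistant reply appended last.
print(outputs[0]["generated_text"][-1])
```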
Features
- 117B parameters, 5.1B active (MoE)
- Harmony-format compatible for chat and agents
- Apache 2.0 license for free commercial use
- Chain-of-thought reasoning with adjustable depth
- Native support for tool use: browsing, code, functions
- Fine-tuning support on a single H100 node
- Deployable via Transformers, vLLM, Ollama, and more (see the serving sketch after this list)
- Efficient inference using MXFP4 quantization
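For server-style deployment, vLLM and Ollama expose an OpenAI-compatible endpoint that the standard `openai` Python client can call. The sketch below assumes a local server already running (for example via `vllm serve openai/gpt-oss-120b`); the `localhost:8000` URL and placeholder API key are assumptions about a default local setup, not fixed values.

```python
from openai import OpenAI

# Assumption: an OpenAI-compatible server (e.g. vLLM) is listening on port 8000;
# local servers typically ignore the API key, so a placeholder is fine.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=[
        {"role": "system", "content": "Reasoning: low"},  # favor speed over depth
        {"role": "user", "content": "Summarize the Apache 2.0 license in two sentences."},
    ],
    max_tokens=200,
)

print(response.choices[0].message.content)
```

The same client code works against any of the listed runtimes as long as they expose the OpenAI-compatible chat completions API; only `base_url` and the model name need to change.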