GPT-4.1 mini
GPT-4.1 mini is a compact version of OpenAI’s powerful GPT-4.1 model, designed to provide high performance while significantly reducing latency and cost. With a smaller size and optimized architecture, GPT-4.1 mini still delivers impressive results in tasks such as coding, instruction following, and long-context processing. It supports up to 1 million tokens of context, making it an efficient solution for applications that require fast responses without sacrificing accuracy or depth.
Learn more
MiniMax M2
MiniMax M2 is an open source foundation model built specifically for agentic applications and coding workflows, striking a new balance of performance, speed, and cost. It excels in end-to-end development scenarios, handling programming, tool-calling, and complex, long-chain workflows with capabilities such as Python integration, while delivering inference speeds of around 100 tokens per second and offering API pricing at just ~8% of the cost of comparable proprietary models. The model supports “Lightning Mode” for high-speed, lightweight agent tasks, and “Pro Mode” for in-depth full-stack development, report generation, and web-based tool orchestration; its weights are fully open source and available for local deployment with vLLM or SGLang. MiniMax M2 positions itself as a production-ready model that enables agents to complete independent tasks, such as data analysis, programming, tool orchestration, and large-scale multi-step logic at real organizational scale.
Learn more
Olmo 3
Olmo 3 is a fully open model family spanning 7 billion and 32 billion parameter variants that delivers not only high-performing base, reasoning, instruction, and reinforcement-learning models, but also exposure of the entire model flow, including raw training data, intermediate checkpoints, training code, long-context support (65,536 token window), and provenance tooling. Starting with the Dolma 3 dataset (≈9 trillion tokens) and its disciplined mix of web text, scientific PDFs, code, and long-form documents, the pre-training, mid-training, and long-context phases shape the base models, which are then post-trained via supervised fine-tuning, direct preference optimisation, and RL with verifiable rewards to yield the Think and Instruct variants. The 32 B Think model is described as the strongest fully open reasoning model to date, competitively close to closed-weight peers in math, code, and complex reasoning.
Learn more
DeepSeek-V3.2-Speciale
DeepSeek-V3.2-Speciale is a high-compute variant of the DeepSeek-V3.2 model, created specifically for deep reasoning and advanced problem-solving tasks. It builds on DeepSeek Sparse Attention (DSA), a custom long-context attention mechanism that reduces computational overhead while preserving high performance. Through a large-scale reinforcement learning framework and extensive post-training compute, the Speciale variant surpasses GPT-5 on reasoning benchmarks and matches the capabilities of Gemini-3.0-Pro. The model achieved gold-medal performance in the International Mathematical Olympiad (IMO) 2025 and International Olympiad in Informatics (IOI) 2025. DeepSeek-V3.2-Speciale does not support tool-calling, making it purely optimized for uninterrupted reasoning and analytical accuracy. Released under the MIT license, it provides researchers and developers an open, state-of-the-art model focused entirely on high-precision reasoning.
Learn more