GLM-4.7-Flash
GLM-4.7-Flash is a lightweight variant of GLM-4.7, Z.ai’s flagship large language model for coding, reasoning, and multi-step agentic tasks. It uses a mixture-of-experts (MoE) architecture optimized for efficient inference, so it can run on local machines with moderate memory while retaining strong reasoning, coding, and agentic abilities. GLM-4.7 itself improves on earlier generations with stronger programming capabilities, more stable multi-step reasoning, context preservation across turns, and better tool-calling workflows, and it supports context lengths of up to ~200K tokens for tasks with large inputs or outputs. The Flash variant retains many of these strengths in a smaller footprint and is competitive on coding and reasoning benchmarks for its size class.
Aion 1.0 Instruct
Aion-1.0-Instruct is a pre-release small language model introduced in Microsoft Edge as a developer preview for early testing and feedback. It is designed to power Edge’s on-device Prompt and Writing Assistance APIs, giving web developers a faster, smaller, and more efficient model for AI-powered browser experiences. Microsoft previously used Phi-4-mini for these APIs, but its hardware requirements limited availability across devices. Aion-1.0-Instruct expands support to significantly more devices, including machines with less capable GPUs and, through CPU inference, devices without a GPU, while still delivering strong quality for a wide range of web use cases. The model is available in Edge Canary and Dev channels, allowing developers to evaluate it in real-world web scenarios, test API interoperability, and provide feedback before final optimizations. Aion-1.0-Instruct is meant to help developers build AI features directly into websites and browser extensions.
Ministral 8B
Mistral AI has introduced two models for on-device computing and edge applications, named "les Ministraux": Ministral 3B and Ministral 8B. These models excel in knowledge, commonsense reasoning, function-calling, and efficiency within the sub-10B parameter range. They support context lengths of up to 128k tokens and are designed for applications such as on-device translation, offline smart assistants, local analytics, and autonomous robotics. Ministral 8B uses an interleaved sliding-window attention pattern for faster and more memory-efficient inference. Both models can serve as intermediaries in multi-step agentic workflows, handling input parsing, task routing, and API calls based on user intent with low latency and cost. Benchmark evaluations indicate that les Ministraux consistently outperform comparable models across multiple tasks. As of October 16, 2024, both models are available, with Ministral 8B priced at $0.1 per million tokens.
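To illustrate why sliding-window attention saves memory and compute, the sketch below builds a basic causal sliding-window mask and counts how many query/key pairs it keeps compared with full causal attention. It is a generic illustration only: the function names and the window size are hypothetical, and it does not reproduce the interleaved pattern Ministral 8B actually uses.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Build a causal sliding-window attention mask.

    mask[i][j] is True when query position i may attend to key position j,
    i.e. j is not in the future (j <= i) and lies within the last `window`
    positions (i - j < window). `window` is a hypothetical parameter here.
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def attended_pairs(mask: list[list[bool]]) -> int:
    """Count the query/key pairs the mask allows."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    seq_len, window = 8, 3
    windowed = sliding_window_mask(seq_len, window)
    # A window as long as the sequence is the same as full causal attention.
    full = sliding_window_mask(seq_len, seq_len)
    for row in windowed:
        print("".join("x" if allowed else "." for allowed in row))
    print("windowed pairs:", attended_pairs(windowed))
    print("full causal pairs:", attended_pairs(full))
```

With a fixed window, the number of attended pairs grows roughly linearly with sequence length instead of quadratically, which is the general reason this family of patterns helps at long context lengths.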
AionUi
AionUi is a desktop workspace where AI agents run on the user’s computer and collaborate on everyday tasks such as writing code, making slides, sorting files, crunching numbers, editing photos, creating reports, writing papers, and running automations around the clock. Users can work with one agent, run multiple agents in parallel, assign tasks to the right assistant, or team agents up inside one unified workspace. AionUi auto-detects Claude Code, Codex, Gemini CLI, Aion CLI, OpenCode, OpenClaw, Goose, and 20+ more tools already installed on the machine, so users can reuse their existing setup without reinstalling or duplicating tools. It includes 20+ built-in assistants for presentations, Excel, financial models, documents, academic papers, diagrams, UI/UX design, games, creative writing, project planning, recruiting, setup, and autonomous end-to-end work. Users can also create custom assistants tailored to their workflow.