Kilo Code Integrations

Google Cloud Platform

Google

Google Cloud is a cloud-based service that allows you to create anything from simple websites to complex applications for businesses of all sizes. New customers get $300 in free credits to run, test, and deploy workloads. All customers can use 25+ products for free, up to monthly usage limits. Use Google's core infrastructure, data analytics & machine learning. Secure and fully featured for all enterprises. Tap into big data to find answers faster and build better products. Grow from prototype to production to planet-scale, without having to think about capacity, reliability or performance. From virtual machines with proven price/performance advantages to a fully managed app development platform. Scalable, resilient, high performance object storage and databases for your applications. State-of-the-art software-defined networking products on Google’s private fiber network. Fully managed data warehousing, batch and stream processing, data exploration, Hadoop/Spark, and messaging.

61,012 Ratings

Starting Price: Free ($300 in free credits)

View Software

Visit Website

GLM-5.2

Zhipu AI

GLM-5.2 is an advanced AI foundation model designed to support complex reasoning, coding, and long-range agentic tasks. It helps developers, teams, and organizations build intelligent systems that can understand instructions, solve technical problems, and assist with demanding workflows. The model is especially useful for software engineering, automation, research, and productivity-focused applications. GLM-5.2 is built to handle large amounts of context, making it suitable for projects that require deeper understanding across extended conversations, documents, or codebases. Its mixture-of-experts design helps balance strong performance with more efficient model operation. GLM-5.2 gives businesses and developers a powerful AI tool for creating smarter applications, improving technical workflows, and supporting advanced digital experiences.

1 Rating

Starting Price: Free

View Software

Laguna S 2.1

Poolside

Laguna S 2.1 is an open weight agentic coding model designed to pursue longer-horizon work and make effective use of reasoning. It uses a 118-billion-parameter Mixture-of-Experts architecture with 8 billion active parameters per token and supports a context window of up to one million tokens in both thinking and no-thinking modes. Its compact active size makes it suitable for complex work on local machines while remaining competitive with models many times larger on terminal, software-engineering, codebase-question-answering, and tool-use benchmarks. Laguna S 2.1 is built to keep working through difficult tasks with greater persistence, verification, and willingness to backtrack instead of declaring success too early. In demonstrated runs, it built and validated a browser rendering engine from an empty folder, optimized an agent harness for faster execution and substantially lower memory allocation, and completed extended mathematical research using the tools in its environment.

1 Rating

View Software

Seed2.1 Pro

ByteDance

Seed2.1 Pro is a next-generation AI productivity model built to handle complex, real-world work across general agents, code engineering, and multimodal understanding. It reliably executes multi-step tasks for high-value office work and everyday consultation, including project planning, file processing, research, tool use, spreadsheet analysis, lesson-plan slide generation, and industry report creation across tools and environments. In software development workflows, Seed2.1 Pro strengthens end-to-end delivery by improving requirement understanding, architecture design, coding, debugging, implementation, and validation. Its agent capabilities are designed to make steady progress on difficult tasks and return practical, verifiable results rather than isolated responses. The model also advances knowledge, reasoning, visual understanding, spatial reasoning, and long-context processing, giving agents a stronger foundation for complex decision-making and execution.

View Software

MiniMax M3

MiniMax

MiniMax M3 is an open-weight multimodal AI model designed for coding, agentic workflows, long-context reasoning, and complex automation tasks. The model combines frontier-level coding performance, native multimodal understanding, and a context window of up to 1 million tokens. MiniMax M3 uses MiniMax Sparse Attention to improve long-context efficiency while reducing compute requirements for large-scale inputs. It supports text, image, and video understanding, making it useful for workflows that combine code, documents, visual references, and tool-driven tasks. The model is built for repository-scale reasoning, software engineering, autonomous task execution, tool calling, and multi-step agent workflows. MiniMax M3 helps developers, AI teams, and enterprises build capable agents that can reason across large contexts and work with multimodal information.

Starting Price: Free

View Software

OpenRouter

OpenRouter is a unified interface for LLMs. OpenRouter scouts for the lowest prices and best latencies/throughputs across dozens of providers, and lets you choose how to prioritize them. No need to change your code when switching between models or providers. You can even let users choose and pay for their own. Evals are flawed; instead, compare models by how often they're used for different purposes. Chat with multiple at once in the chatroom. Model usage can be paid by users, developers, or both, and may shift in availability. You can also fetch models, prices, and limits via API. OpenRouter routes requests to the best available providers for your model, given your preferences. By default, requests are load-balanced across the top providers to maximize uptime, but you can customize how this works using the provider object in the request body. Prioritize providers that have not seen significant outages in the last 10 seconds.

1 Rating

Starting Price: Free

View Software

Visual Studio Code

Microsoft

Visual Studio Code (VS Code) is Microsoft’s open-source AI code editor designed to make coding faster, smarter, and more collaborative. It supports thousands of extensions and nearly every programming language, offering developers a lightweight yet powerful environment for writing, testing, and debugging code. With AI-powered features like GitHub Copilot, Next Edit Suggestions, and Agent Mode, VS Code helps you code with precision, automate complex tasks, and streamline development workflows. It integrates seamlessly with cloud services, remote repositories, and tools like Git, Docker, and Azure. The editor is fully customizable, allowing you to personalize your layout, color themes, and keyboard shortcuts. Whether coding locally or in the browser, VS Code delivers a complete development experience for individuals and teams alike.

28 Ratings

Starting Price: Free

View Software

IntelliJ IDEA

JetBrains

IntelliJ IDEA is a professional-grade integrated development environment (IDE) primarily designed for Java and Kotlin development. It helps developers write code faster by automating routine tasks and providing smart coding assistance. The IDE supports the full software development lifecycle, from design and coding to testing and deployment. IntelliJ IDEA stays up to date with the latest language features, such as full support for Java 24 and Kotlin K2 mode. It offers a smooth and enjoyable workflow that helps developers stay focused and productive. The platform also emphasizes data privacy and security, complying with industry standards like SOC 2.

21 Ratings

Starting Price: $19.90 per user per month

View Software

Anaconda

Anaconda is an AI-native development platform that helps teams move from experimentation to production with trusted open-source packages, governed environments, and production-grade orchestration. The platform provides a secure foundation for Python, data science, machine learning, and AI development across the full model lifecycle. Anaconda Core helps teams manage complex Python dependencies with validated packages, automated security scanning, and intelligent conflict resolution. The Anaconda Platform supports governed AI development so organizations can reduce broken environments, stalled deployments, and unmanaged open-source risk. Its trusted distribution is used by millions of users, developers, contributors, organizations, and Fortune 500 companies. Built for enterprise AI teams, Anaconda helps organizations accelerate open-source AI innovation while maintaining control, security, and governance.

9 Ratings

View Software

SuperGrok

SpaceXAI

SuperGrok is a premium AI subscription service developed by xAI, built on advanced versions of the Grok language model. It provides access to more powerful AI capabilities compared to standard or free versions. The platform is designed for tasks such as advanced reasoning, coding, research, and content creation. SuperGrok includes multimodal functionality, allowing it to work with text, images, and other data types. It offers faster responses, higher usage limits, and longer conversation capabilities. Users can also access advanced tools like deep search, AI agents, and enhanced generation features. The service is optimized for professionals who require higher performance and deeper analysis. By combining improved models and expanded features, it delivers a more capable AI experience.

1 Rating

Starting Price: $30/month

View Software

Claude Sonnet 3.7

Anthropic

Claude Sonnet 3.7, developed by Anthropic, is a cutting-edge AI model that combines rapid response with deep reflective reasoning. This innovative model allows users to toggle between quick, efficient responses and more thoughtful, reflective answers, making it ideal for complex problem-solving. By allowing Claude to self-reflect before answering, it excels at tasks that require high-level reasoning and nuanced understanding. With its ability to engage in deeper thought processes, Claude Sonnet 3.7 enhances tasks such as coding, natural language processing, and critical thinking applications. Available across various platforms, it offers a powerful tool for professionals and organizations seeking a high-performance, adaptable AI.

1 Rating

Starting Price: Free

View Software

Grok Build

SpaceXAI

Grok Build is an AI-powered command-line development environment designed to help developers build, manage, and automate software projects more efficiently. The platform provides a fast and flicker-free CLI experience that supports planning, coding, reviewing, and coordinating tasks across multiple AI-powered agents. Grok Build can adapt to different workflows and user preferences through customizable skills and interface enhancements. Developers can use the platform to architect complex projects with plan viewers, subagents, and parallel task execution capabilities. The system also includes marketplaces that allow teams to share workflows, capabilities, and productivity tools across projects. Grok Build supports interactive coding assistance, interface refinement suggestions, and contextual prompts that help streamline development processes.

1 Rating

Starting Price: Free

View Software

Grok Build 0.1

SpaceXAI

Grok Build 0.1 is a specialized AI coding model from xAI designed for agentic software engineering workflows and multi-step development tasks. The model is optimized to help coding agents perform actions such as planning, debugging, implementing changes, and iterating on code rather than simply generating one-time code responses. It supports both text and image inputs while producing text-based outputs, making it useful for analyzing code, screenshots, and technical documentation. Grok Build 0.1 includes support for tool use, structured outputs, function calling, and large-context reasoning capabilities. With a context window of up to 256,000 tokens, the model can process large codebases and complex projects within a single workflow. The platform is built for developers and engineering teams seeking faster and more capable AI-assisted software development.

1 Rating

Starting Price: $1 per 1M tokens (input)

View Software

Grok 4.3

SpaceXAI

Grok 4.3 is the latest iteration of xAI’s Grok model, designed to deliver improved reasoning, real-time information access, and advanced task automation. It builds on earlier Grok 4 models by enhancing performance in complex problem-solving, coding, and analytical workflows. The model is integrated with real-time web and X (formerly Twitter) data, allowing it to provide up-to-date insights and answers. Grok 4.3 supports multimodal capabilities, enabling it to work with text, images, and other data types. It operates within the SuperGrok Heavy tier, offering access to more powerful compute and advanced features. The model is designed to handle long-context tasks and multi-step reasoning with greater accuracy. It also supports tool use and integrations, enabling it to interact with external systems and automate workflows. Overall, Grok 4.3 is positioned as a high-performance AI assistant for real-time, data-driven tasks.

1 Rating

View Software

Grok Code Fast 1

SpaceXAI

Grok Code Fast 1 is a high-speed, economical reasoning model designed specifically for agentic coding workflows. Unlike traditional models that can feel slow in tool-based loops, it delivers near-instant responses, excelling in everyday software development tasks. Built from scratch with a programming-rich corpus and refined on real-world pull requests, it supports languages like TypeScript, Python, Java, Rust, C++, and Go. Developers can use it for everything from zero-to-one project building to precise bug fixes and codebase Q&A. With optimized inference and caching techniques, it achieves impressive responsiveness and a 90%+ cache hit rate when integrated with partners like GitHub Copilot, Cursor, and Cline. Offered at just $0.20 per million input tokens and $1.50 per million output tokens, Grok Code Fast 1 strikes a strong balance between speed, performance, and affordability.

Starting Price: $0.20 per million input tokens

View Software

GLM-5

Zhipu AI

GLM-5 is Z.ai’s latest large language model built for complex systems engineering and long-horizon agentic tasks. It scales significantly beyond GLM-4.5, increasing total parameters and training data while integrating DeepSeek Sparse Attention to reduce deployment costs without sacrificing long-context capacity. The model combines enhanced pre-training with a new asynchronous reinforcement learning infrastructure called slime, improving training efficiency and post-training refinement. GLM-5 achieves best-in-class performance among open-source models across reasoning, coding, and agent benchmarks, narrowing the gap with leading frontier models. It ranks highly on evaluations such as Vending Bench 2, demonstrating strong long-term planning and operational capabilities. The model is open-sourced under the MIT License.

Starting Price: Free

View Software

GLM-5.1

Zhipu AI

GLM-5.1 is the latest iteration of Z.ai’s GLM series, designed as a frontier-level, agent-oriented AI model optimized for coding, reasoning, and long-horizon workflows. It builds on the GLM-5 architecture, which uses a Mixture-of-Experts (MoE) design to deliver high performance while keeping inference costs efficient, and is part of a broader push toward open-weight, developer-accessible models. A core focus of GLM-5.1 is enabling agentic behavior, meaning it can plan, execute, and iterate across multi-step tasks rather than simply responding to single prompts. It is specifically designed to handle complex workflows such as debugging code, navigating repositories, and executing chained operations with sustained context. Compared to earlier models, GLM-5.1 improves reliability in long interactions, maintaining coherence across extended sessions and reducing breakdowns in multi-step reasoning.

Starting Price: Free

View Software

omp

omp is an open source AI coding agent and development harness that provides developers with a powerful local environment for AI-assisted engineering. It connects AI models directly to IDE capabilities, debugging tools, code execution, language servers, browser automation, memory, and dozens of built-in development tools. It supports more than 40 AI providers while allowing developers to use a single interface across cloud and local language models. omp enhances coding performance with features such as intelligent code editing, parallel subagents, persistent execution environments, integrated debugging, and advanced code review workflows. It also includes collaborative sessions, local memory, workflow automation, browser control, and GitHub integration to streamline complex software development tasks. Built with a native Rust engine and designed for Windows, macOS, and Linux, omp helps developers build, debug, and maintain software.

Starting Price: Free

View Software

GLM-4.6

Zhipu AI

GLM-4.6 advances upon its predecessor with stronger reasoning, coding, and agentic capabilities: it demonstrates clear improvements in inferential performance, supports tool use during inference, and more effectively integrates into agent frameworks. In benchmark tests spanning reasoning, coding, and agents, GLM-4.6 outperforms GLM-4.5 and shows competitive strength against models such as DeepSeek-V3.2-Exp and Claude Sonnet 4, though it still trails Claude Sonnet 4.5 in pure coding performance. In real-world tests using an extended “CC-Bench” suite across front-end development, tool building, data analysis, and algorithmic tasks, GLM-4.6 beats GLM-4.5 and approaches parity with Claude Sonnet 4, winning ~48.6% of head-to-head comparisons, while also achieving ~15% better token efficiency. GLM-4.6 is available via the Z.ai API, and developers can integrate it as an LLM backend or agent core using the platform’s API.

Starting Price: Free

View Software

MiniMax M2

MiniMax

MiniMax M2 is an open source foundation model built specifically for agentic applications and coding workflows, striking a new balance of performance, speed, and cost. It excels in end-to-end development scenarios, handling programming, tool-calling, and complex, long-chain workflows with capabilities such as Python integration, while delivering inference speeds of around 100 tokens per second and offering API pricing at just ~8% of the cost of comparable proprietary models. The model supports “Lightning Mode” for high-speed, lightweight agent tasks, and “Pro Mode” for in-depth full-stack development, report generation, and web-based tool orchestration; its weights are fully open source and available for local deployment with vLLM or SGLang. MiniMax M2 positions itself as a production-ready model that enables agents to complete independent tasks, such as data analysis, programming, tool orchestration, and large-scale multi-step logic at real organizational scale.

Starting Price: $0.30 per million input tokens

View Software

Devstral 2

Mistral AI

Devstral 2 is a next-generation, open source agentic AI model tailored for software engineering: it doesn’t just suggest code snippets, it understands and acts across entire codebases, enabling multi-file edits, bug fixes, refactoring, dependency resolution, and context-aware code generation. The Devstral 2 family includes a large 123-billion-parameter model as well as a smaller 24-billion-parameter variant (“Devstral Small 2”), giving teams flexibility; the larger model excels in heavy-duty coding tasks requiring deep context, while the smaller one can run on more modest hardware. With a vast context window of up to 256 K tokens, Devstral 2 can reason across extensive repositories, track project history, and maintain a consistent understanding of lengthy files, an advantage for complex, real-world projects. The CLI tracks project metadata, Git statuses, and directory structure to give the model context, making “vibe-coding” more powerful.

Starting Price: Free

View Software

Devstral Small 2

Mistral AI

Devstral Small 2 is the compact, 24 billion-parameter variant of the new coding-focused model family from Mistral AI, released under the permissive Apache 2.0 license to enable both local deployment and API use. Alongside its larger sibling (Devstral 2), this model brings “agentic coding” capabilities to environments with modest compute: it supports a large 256K-token context window, enabling it to understand and make changes across entire codebases. On the standard code-generation benchmark (SWE-Bench Verified), Devstral Small 2 scores around 68.0%, placing it among open-weight models many times its size. Because of its reduced size and efficient design, Devstral Small 2 can run on a single GPU or even CPU-only setups, making it practical for developers, small teams, or hobbyists without access to data-center hardware. Despite its compact footprint, Devstral Small 2 retains key capabilities of larger models; it can reason across multiple files and track dependencies.

Starting Price: Free

View Software

GLM-4.6V

Zhipu AI

GLM-4.6V is a state-of-the-art open source multimodal vision-language model from the Z.ai (GLM-V) family designed for reasoning, perception, and action. It ships in two variants: a full-scale version (106B parameters) for cloud or high-performance clusters, and a lightweight “Flash” variant (9B) optimized for local deployment or low-latency use. GLM-4.6V supports a native context window of up to 128K tokens during training, enabling it to process very long documents or multimodal inputs. Crucially, it integrates native Function Calling, meaning the model can take images, screenshots, documents, or other visual media as input directly (without manual text conversion), reason about them, and trigger tool calls, bridging “visual perception” with “executable action.” This enables a wide spectrum of capabilities; interleaved image-and-text content generation (for example, combining document understanding with text summarization or generation of image-annotated responses).

Starting Price: Free

View Software

GLM-4.1V

Zhipu AI

GLM-4.1V is a vision-language model, providing a powerful, compact multimodal model designed for reasoning and perception across images, text, and documents. The 9-billion-parameter variant (GLM-4.1V-9B-Thinking) is built on the GLM-4-9B foundation and enhanced through a specialized training paradigm using Reinforcement Learning with Curriculum Sampling (RLCS). It supports a 64k-token context window and accepts high-resolution inputs (up to 4K images, any aspect ratio), enabling it to handle complex tasks such as optical character recognition, image captioning, chart and document parsing, video and scene understanding, GUI-agent workflows (e.g., interpreting screenshots, recognizing UI elements), and general vision-language reasoning. In benchmark evaluations at the 10 B-parameter scale, GLM-4.1V-9B-Thinking achieved top performance on 23 of 28 tasks.

Starting Price: Free

View Software

GLM-4.5V-Flash

Zhipu AI

GLM-4.5V-Flash is an open source vision-language model, designed to bring strong multimodal capabilities into a lightweight, deployable package. It supports image, video, document, and GUI inputs, enabling tasks such as scene understanding, chart and document parsing, screen reading, and multi-image analysis. Compared to larger models in the series, GLM-4.5V-Flash offers a compact footprint while retaining core VLM capabilities like visual reasoning, video understanding, GUI task handling, and complex document parsing. It can serve in “GUI agent” workflows, meaning it can interpret screenshots or desktop captures, recognize icons or UI elements, and assist with automated desktop or web-based tasks. Although it forgoes some of the largest-model performance gains, GLM-4.5V-Flash remains versatile for real-world multimodal tasks where efficiency, lower resource usage, and broad modality support are prioritized.

Starting Price: Free

View Software

GLM-4.5V

Zhipu AI

GLM-4.5V builds on the GLM-4.5-Air foundation, using a Mixture-of-Experts (MoE) architecture with 106 billion total parameters and 12 billion activation parameters. It achieves state-of-the-art performance among open-source VLMs of similar scale across 42 public benchmarks, excelling in image, video, document, and GUI-based tasks. It supports a broad range of multimodal capabilities, including image reasoning (scene understanding, spatial recognition, multi-image analysis), video understanding (segmentation, event recognition), complex chart and long-document parsing, GUI-agent workflows (screen reading, icon recognition, desktop automation), and precise visual grounding (e.g., locating objects and returning bounding boxes). GLM-4.5V also introduces a “Thinking Mode” switch, allowing users to choose between fast responses or deeper reasoning when needed.

Starting Price: Free

View Software

GLM-4.7

Zhipu AI

GLM-4.7 is an advanced large language model designed to significantly elevate coding, reasoning, and agentic task performance. It delivers major improvements over GLM-4.6 in multilingual coding, terminal-based tasks, and real-world software engineering benchmarks such as SWE-bench and Terminal Bench. GLM-4.7 supports “thinking before acting,” enabling more stable, accurate, and controllable behavior in complex coding and agent workflows. The model also introduces strong gains in UI and frontend generation, producing cleaner webpages, better layouts, and more polished slides. Enhanced tool-using capabilities allow GLM-4.7 to perform more effectively in web browsing, automation, and agent benchmarks. Its reasoning and mathematical performance has improved substantially, showing strong results on advanced evaluation suites. GLM-4.7 is available via Z.ai, API platforms, coding agents, and local deployment for flexible adoption.

Starting Price: Free

View Software

MiniMax-M2.1

MiniMax

MiniMax-M2.1 is an open-source, agentic large language model designed for advanced coding, tool use, and long-horizon planning. It was released to the community to make high-performance AI agents more transparent, controllable, and accessible. The model is optimized for robustness in software engineering, instruction following, and complex multi-step workflows. MiniMax-M2.1 supports multilingual development and performs strongly across real-world coding scenarios. It is suitable for building autonomous applications that require reasoning, planning, and execution. The model weights are fully open, enabling local deployment and customization. MiniMax-M2.1 represents a major step toward democratizing top-tier agent capabilities.

Starting Price: Free

View Software

Kilo Code Reviewer

Kilo Code

Kilo Code Reviewer is an AI-powered automated code review tool that analyzes pull requests the moment they are opened or updated, understands the changes in context, and provides actionable feedback, including inline comments, explanations, and suggestions to catch bugs, security issues, performance problems, style violations, test gaps, and documentation omissions before human review. It integrates with GitHub, GitLab, and (soon) Bitbucket, lets users choose from a wide selection of models and customize review strictness and focus areas to match team standards, and can be run locally in IDEs like VS Code or JetBrains to catch issues before commit. The setup is simple, connect a repository, select an AI model and review style, and the agent runs automatically on PRs, helping enforce coding standards consistently and complement human reviewers with instant, context-aware insights.

Starting Price: Free

View Software

MiniMax M2.5

MiniMax

MiniMax M2.5 is a frontier AI model engineered for real-world productivity across coding, agentic workflows, search, and office tasks. Extensively trained with reinforcement learning in hundreds of thousands of real-world environments, it achieves state-of-the-art performance in benchmarks such as SWE-Bench Verified and BrowseComp. The model demonstrates strong architectural thinking, decomposing complex problems before generating code across more than ten programming languages. M2.5 operates at high throughput speeds of up to 100 tokens per second, enabling faster completion of multi-step tasks. It is optimized for efficient reasoning, reducing token usage and execution time compared to previous versions. With dramatically lower pricing than competing frontier models, it delivers powerful performance at minimal cost. Integrated into MiniMax Agent, M2.5 supports professional-grade office workflows, financial modeling, and autonomous task execution.

Starting Price: Free

View Software

MiniMax M2.7

MiniMax

MiniMax M2.7 is an advanced AI model designed to enhance real-world productivity across coding, search, and office workflows. It is trained with reinforcement learning across numerous real-world environments, enabling it to handle complex, multi-step tasks effectively. The model excels in problem-solving by breaking down challenges before generating solutions across multiple programming languages. It delivers high-speed performance with rapid token generation, allowing tasks to be completed efficiently. With optimized reasoning and cost-effective pricing, it provides powerful capabilities while minimizing resource usage. It also achieves strong performance in software engineering benchmarks, reducing incident response time and improving development efficiency. Additionally, it supports advanced agentic workflows and professional-grade office tasks, making it highly versatile for modern work environments.

Starting Price: Free

View Software

MiMo-V2-Pro

Xiaomi Technology

Xiaomi MiMo-V2-Pro is a flagship AI foundation model designed to power real-world agentic workflows and complex task execution. It is built to function as the core intelligence behind agent systems, enabling orchestration of multi-step processes and production-level tasks. The model demonstrates strong capabilities in coding, tool usage, and search-based tasks, performing competitively on global benchmarks. With its large-scale architecture and extended context window, it can handle long and complex interactions efficiently. MiMo-V2-Pro is optimized for practical applications, delivering reliable performance across development, automation, and enterprise workflows.

Starting Price: $1/million tokens

View Software

KiloClaw

Kilo Code

KiloClaw is a fully managed, cloud-hosted version of the open source AI agent OpenClaw, designed to let users deploy and run a powerful autonomous AI assistant without handling infrastructure, setup, or maintenance. It provides a one-click deployment experience where users can launch a working AI agent in under 60 seconds, eliminating the need for Docker, servers, SSH configuration, or manual environment setup. It runs on Kilo’s infrastructure and connects to more than 500 AI models through the Kilo Gateway, allowing users to switch between models or bring their own API keys while maintaining a unified system for billing and management. KiloClaw agents are capable of performing real actions rather than just generating text, including browsing the web, running commands, managing files, scheduling tasks, and interacting across chat platforms such as Telegram, Discord, and Slack.

Starting Price: $4 per month

View Software

OpenSpec

Fission AI

OpenSpec is an open-source spec-driven development framework designed to bring structure and clarity to AI-assisted coding workflows. It introduces a lightweight specification layer that helps teams define requirements before writing code. The platform organizes each change into structured artifacts such as proposals, specifications, designs, and task lists. It integrates with over 20 AI coding tools, allowing developers to use their preferred assistants while maintaining consistency. OpenSpec emphasizes an iterative and flexible approach rather than rigid development phases. Its command-based workflow enables users to propose, implement, and archive features efficiently. Overall, OpenSpec helps developers align with AI systems, reduce ambiguity, and produce more predictable and reliable outcomes.

Starting Price: Free

View Software

Mercury Edit 2

Inception

Mercury Edit 2 is part of Inception Labs’ Mercury family of AI models, designed to perform high-speed reasoning, coding, and editing tasks using a fundamentally different architecture from traditional large language models. It builds on Mercury 2, a diffusion-based reasoning model that generates and refines entire outputs in parallel rather than producing text token by token, enabling significantly faster performance and more responsive editing workflows. Instead of acting like a sequential “typewriter,” the system behaves more like an editor, starting with a rough draft and iteratively improving it across multiple tokens at once, which allows for real-time interaction and rapid iteration in tasks such as code editing, content generation, and agent-based workflows. This architecture delivers throughput of up to around 1,000 tokens per second, making it several times faster than conventional models while maintaining competitive reasoning quality across benchmarks.

Starting Price: $0.25 per 1M input tokens

View Software

Constellation

ShiftinBits Inc

Graph-backed code intelligence for your AI assistant. Constellation turns your codebase into a queryable knowledge graph, giving AI assistants the structural understanding they need to reason about real software — not just the plain text. Why Constellation? Text search tells you where a string appears, *everywhere* that string appears. Constellation tells you the exact location of the symbol in question, what it means, what calls it, and what breaks if you change it. Before your assistant edits a function, it can ask: - Where is this defined, and where is it used across the codebase? - What's the blast radius of this change? - Which modules have circular dependencies or dead code? - How does data flow through the call graph? Answers come from a semantic graph, not a grep loop. One Tool, Countless Capabilities A single `code_intel` tool exposes a rich JavaScript API as a "Code Mode" tool, allowing AI agents to craft complex composite queries.

Starting Price: $29.99/month

View Software

Laguna XS.2

Poolside

Laguna XS.2 is Poolside’s open-weight agentic coding model, built as the lightest and fastest model in the Laguna family. It is a 33B total-parameter Mixture of Experts model with 3B activated parameters, trained completely in-house on 30T tokens. As Poolside’s newest generation model open to the community, Laguna XS.2 is a second-generation architecture and the company’s first open-weight model, built on the lessons learned from training Laguna M.1 across synthetic data and reinforcement learning. The model is designed for agentic coding workflows, where it can code, act, iterate quickly, and perform best inside Poolside’s coding agent. Laguna XS.2 is positioned as a strong model for rapid agentic iteration, especially for developers and teams that need a compact, efficient coding model rather than a heavier frontier system. It is released under an Apache 2.0 license, allowing the community to evaluate, fine-tune, quantize, serve, and build on the weights.

Starting Price: Free

View Software

Laguna M.1

Poolside

Laguna M.1 is Poolside’s most capable model for agentic coding, built and trained in-house for software development workflows. It is a 225B total-parameter Mixture of Experts model with 23B activated parameters, trained completely in-house on 30T tokens using 6,144 interconnected NVIDIA H200 GPUs. Poolside trained Laguna M.1 from scratch with its own data work, training codebase, and async on-policy reinforcement learning in its agent harness, all with agentic coding in mind. The model is designed to perform at its best inside Poolside’s coding agent, where it can reason through software tasks, interact with tools, edit code, run tests, and support longer autonomous development sessions. Laguna M.1 is built for developers and teams working on complex coding tasks that require stronger reasoning, architectural understanding, terminal use, and multi-step execution than lightweight models can provide.

Starting Price: Free

View Software

Ling 2.6

Ant Group

Ling 2.6 is a general-purpose large language model series independently developed and open-sourced by Ant Group, built on a Mixture of Experts architecture and designed for inference efficiency, long context modeling, training technology, and AI Agent collaborative reasoning. Ling’s MoE architecture routes each token to activate only the most relevant expert subnetworks, compressing actual computation to a minimal fraction while maintaining large-scale model capacity. The Ling 2.6 series further advances long-sequence modeling, with Ling-2.6-1T supporting up to a 1M native context window and the official API exposing a 256K context window, while Ling-2.6-flash provides a native 256K context window capable of processing approximately 200,000 characters of long-form input. The models are designed for reliable long-range information retrieval, with no noticeable degradation whether information appears at the beginning, middle, or end of the context.

Starting Price: $0.0028 per 1M tokens

View Software

Ling 2.6 Flash

Ant Group

Ling 2.6 Flash is the latest cost-effective model in the Ling series, built on a Mixture of Experts architecture with 104B total parameters and 7.4B activated parameters. It is designed to achieve an optimal balance between inference performance and compute cost, making it suitable for general-purpose scenarios where strong reasoning capability, high throughput, and efficient deployment matter. Ling’s MoE architecture routes each token to activate only the most relevant expert subnetworks, compressing actual computation to a minimal fraction while maintaining large-scale model capacity. Ling 2.6 Flash provides a native 256K context window and can process approximately 200,000 characters of long-form input, with reliable long-range information retrieval whether key information appears at the beginning, middle, or end of the context. Its aggregate benchmark performance is comparable to or exceeds 40B-class Dense models.

Starting Price: $0.00037 per 1M tokens

View Software

Ring 2.6

Ant Group

Ring is a trillion-parameter thinking model from Ant Group, designed for real-world Agent workflows. It uses the same Mixture of Experts architecture as Ling, activating about 63B parameters per inference, and focuses on coding agents, tool use, multi-tool collaboration, engineering development, research analysis, and long-horizon task execution. Rather than only pursuing “smarter” results, Ring is built to consistently complete complex tasks at reasonable cost, balancing quality, speed, and execution efficiency in production environments. Ring-2.6-1T introduces an adjustable Reasoning Effort mechanism with high and xhigh reasoning intensity levels, using adaptive reasoning budget allocation based on task complexity. High mode is designed for high-frequency Agent workflows, lower token cost, faster multi-step execution, multi-turn interaction, tool collaboration, and task decomposition.

Starting Price: $0.0028 per 1M tokens

View Software

Tencent Hy

Tencent

Tencent HY is a self-developed, general-purpose, and multimodal large model family developed by Tencent, built to provide enterprise-grade AI services for content products, creative production, business automation, and real-world agent workflows. It covers language, image, 3D, translation, and other modalities, combining Tencent’s self-developed large model algorithms with natural language processing and computer vision technology to support higher-quality image creation, 3D generation, and intelligent content applications. Through Tencent Hunyuan AI Studio, users can interact with the model through natural human-computer dialogue, allowing the system to understand instructions, execute tasks, help users obtain information, generate content, and explore model capabilities in a practical workspace. Tencent HY supports API calls and custom parameter settings, making the model family easier to use for developers, product teams, and enterprise applications.

View Software

Ling 3.0 Flash

Ant Group

Ling 3.0 Flash is a next-generation efficient language model designed for long-horizon agent workflows, combining fast response, low activation, and stable tool use. It uses a Mixture-of-Experts architecture with 124 billion total parameters and 5.1 billion activated parameters per token, providing capability while keeping inference efficient. The model supports a native 256K context window that can be extended up to 1 million tokens, with reliable retrieval across information placed at the beginning, middle, or end of long contexts. Compared with the previous Flash model, Ling 3.0 Flash improves stability on extended tasks, tool-calling accuracy, instruction following, compatibility with agent harnesses, and coding performance. Its optimized spatial understanding can construct physical scene grids and reason about relative positions, while hybrid reasoning improves success rates across tasks of varying difficulty.

View Software

Seed2.0 Pro

ByteDance

Seed2.0 Pro is an advanced general-purpose agent model designed for large-scale production environments and complex real-world tasks. It focuses on long-chain inference capabilities and stability, making it ideal for handling multi-step workflows and intricate business applications. As part of the Seed 2.0 model series, it delivers major upgrades in multimodal understanding, including visual reasoning, motion perception, and instruction-following accuracy. The model demonstrates state-of-the-art performance across leading benchmarks in mathematics, science, coding, and visual reasoning. Seed2.0 Pro excels at interactive visual applications, such as recreating webpages from a single image and generating runnable front-end code with animations. It also supports professional workflows like CAD modeling, biotechnology research assistance, and structured data extraction from complex charts.

View Software

MiMo-V2.5-Pro

Xiaomi Technology

Xiaomi MiMo-V2.5-Pro is an advanced open-source AI model designed to handle complex, long-horizon tasks with strong agentic capabilities. It features a Mixture-of-Experts architecture with over one trillion parameters and a large context window of up to one million tokens. The model is built to perform sophisticated reasoning, coding, and problem-solving across extended workflows. It demonstrates high performance on benchmark tests related to software engineering, reasoning, and general intelligence. MiMo-V2.5-Pro can autonomously complete complex projects, such as building full software systems or optimizing engineering designs. It uses hybrid attention mechanisms to balance efficiency and performance across long contexts. The model is also optimized for token efficiency, reducing computational cost while maintaining strong results. By combining scalability, efficiency, and advanced reasoning, MiMo-V2.5-Pro represents a major step forward in open-source AI models.

View Software

MiMo-V2.5

Xiaomi Technology

Xiaomi MiMo-V2.5 is an advanced open-source AI model designed to combine strong agentic capabilities with native multimodal understanding. It can process and reason across text, images, and audio within a single unified system. The model uses a sparse Mixture-of-Experts architecture with hundreds of billions of parameters for efficient performance. It supports an extended context window of up to one million tokens, enabling long and complex workflows. MiMo-V2.5 is built to handle tasks such as coding, reasoning, and multimodal analysis with high accuracy. It incorporates dedicated visual and audio encoders to enhance perception and cross-modal reasoning. The model demonstrates strong benchmark performance across coding, reasoning, and multimodal tasks. By combining multimodality, efficiency, and agentic intelligence, MiMo-V2.5 advances the capabilities of open-source AI systems.

View Software

UnoRouter

UnoRouter is an OpenAI-compatible LLM gateway. One API key gives you 200+ models across providers (OpenAI, Anthropic, Google and more), drop-in for coding agents like Claude Code, Cline, Codex and Kilo Code. Point any OpenAI SDK at the base URL and switch models without changing code. UnoRouter also includes a built-in chat and character client (personas, lorebooks, SillyTavern card import) on the same key. Usage-based pricing with a free tier, live model and price data.

Starting Price: Free tier, usage-based

View Software

Ming-Flash Omni 2.0

Ant Group

Ming-Flash Omni 2.0 is a full-modal large language model from Ant Group, built on a unified multimodal architecture with “modal unity + task unity” as its core design philosophy. As part of the Ming series, it is designed to achieve cross-modal understanding and generation across text, images, audio, and video, allowing one model to see, hear, speak, and draw instead of relying on multiple specialized models. Ming-Flash Omni 2.0 follows the evolution of Ming-Light Omni and Ming-Flash Omni Preview, moving from unified architecture validation and hundred-billion-parameter scaling to a Data Scaling strategy that achieves open-source SOTA performance on multiple benchmarks. The model integrates four core capability modules: image-text understanding, video analysis, speech synthesis, and image generation or editing. For image-text understanding, Ming introduces structured knowledge graphs for fine-grained visual perception.

View Software

Seed2.1 Turbo

ByteDance

Seed2.1 Turbo is a next-generation AI productivity model designed to execute complex real-world tasks with strong general-agent, coding, and multimodal capabilities. It goes beyond one-off answers by carrying multi-step workflows toward defined goals and producing practical, usable outcomes across tools, environments, and interaction modes. For professional work and everyday consultation, it can support project planning, document and file processing, information analysis, solution design, content planning, tool use, and results consolidation. It also handles teaching, office, and research scenarios such as generating lesson-plan slides, analyzing complex spreadsheets, and producing industry reports. In software engineering, Seed2.1 Turbo supports end-to-end delivery across requirement analysis, feature implementation, bug fixing, environment setup, terminal usage, and result validation, while understanding codebase architecture, dependencies, and business logic to coordinate changes.

View Software

Laguna XS 2.1

Poolside

Laguna XS 2.1 is an upgraded open weight agentic coding model designed for long-horizon work on a local machine. It uses a 33-billion-parameter Mixture-of-Experts architecture with 3 billion activated parameters per token, retaining the same efficient architecture as Laguna XS.2 while improving multilingual software engineering and terminal-style task performance. The model is built to support coding agents that inspect repositories, reason through complex changes, use tools, execute commands, and continue working across extended tasks. It is served with a 256K context window, giving agents room to work with large codebases, lengthy histories, and multi-step workflows. Laguna XS 2.1 is supported by vLLM, SGLang, NVIDIA TensorRT-LLM, Hugging Face Transformers, and Ollama, with native llama.cpp support planned. It is available in BF16, FP8, INT4, and NVFP4 checkpoints, allowing developers to choose between maximum fidelity and configurations suited to tighter VRAM or compute budgets.

View Software

Kilo Code Integrations

51 Integrations with Kilo Code

Google Cloud Platform

GLM-5.2

Laguna S 2.1

Seed2.1 Pro

MiniMax M3

OpenRouter

Visual Studio Code

IntelliJ IDEA

Anaconda

SuperGrok

Claude Sonnet 3.7

Grok Build

Grok Build 0.1

Grok 4.3

Grok Code Fast 1

GLM-5

GLM-5.1

omp

GLM-4.6

MiniMax M2

Devstral 2

Devstral Small 2

GLM-4.6V

GLM-4.1V

GLM-4.5V-Flash

GLM-4.5V

GLM-4.7

MiniMax-M2.1

Kilo Code Reviewer

MiniMax M2.5

MiniMax M2.7

MiMo-V2-Pro

KiloClaw

OpenSpec

Mercury Edit 2

Constellation

Laguna XS.2

Laguna M.1

Ling 2.6

Ling 2.6 Flash

Ring 2.6

Tencent Hy

Ling 3.0 Flash

Seed2.0 Pro

MiMo-V2.5-Pro

MiMo-V2.5

UnoRouter

Ming-Flash Omni 2.0

Seed2.1 Turbo

Laguna XS 2.1

Related Categories

Related Categories That Integrate With Kilo Code