Showing 147 open source projects for "ace-step"

  • 1
    Solace Agent Mesh

    An event-driven framework designed to build multi-agent AI systems

    Solace Agent Mesh is an event-driven framework designed to build, orchestrate, and scale multi-agent AI systems where specialized agents collaborate to solve complex tasks across distributed environments. It addresses one of the main challenges in modern AI systems, which is connecting isolated agents, data sources, and enterprise systems into a cohesive and interoperable ecosystem. The framework uses an asynchronous messaging architecture powered by an event broker, enabling agents to...
    Downloads: 2 This Week
  • 2
    Agent Framework

    Framework for building, orchestrating, and deploying AI agents

    ...Microsoft Agent Framework supports graph-based orchestration that enables developers to connect agents, functions, and tools into structured workflows capable of handling multi-step processes. It also includes components such as agent sessions for managing state, context providers for maintaining memory, and middleware for intercepting and extending agent behavior. Developers can integrate external tools and services so that agents can execute actions beyond text generation.
    Downloads: 2 This Week
  • 3
    Hephaestus

    Semi-Structured Agentic Framework. Workflows build themselves

    Hephaestus is an open-source semi-structured agentic framework designed to orchestrate multiple AI agents working together on complex tasks. Instead of relying entirely on predefined workflows, the framework allows agents to dynamically create tasks as they explore a problem space. Developers define high-level phases such as analysis, implementation, and testing, while agents generate specific subtasks within those phases. The system continuously monitors agent behavior and task progression,...
    Downloads: 1 This Week
  • 4
    OAGI Python SDK

    Python SDK for the Computer Use model Lux, developed by OpenAGI

    ...It exposes the OAGI API in an ergonomic way, letting you trigger Lux in three main modes: Tasker for precise scripted sequences, Actor for fast one-shot tasks, and Thinker for open-ended, multi-step objectives. The SDK is designed around “computer use” as a paradigm, where the AI actually navigates interfaces, clicks, types, scrolls, and reads the screen through screenshots instead of only calling APIs. It provides high-level asynchronous agents (like AsyncDefaultAgent and AsyncActor) that encapsulate the loop of capturing screenshots, sending them to Lux, interpreting responses, and executing UI actions with PyAutoGUI. ...
    Downloads: 5 This Week
  • 5
    Agent S

    Agent S: an open agentic framework that uses computers like a human

    ...Built to operate graphical user interfaces like a human, it allows AI agents to perceive screens, reason about tasks, and execute actions across macOS, Windows, and Linux systems. The latest version, Agent S3, surpasses human-level performance on the OSWorld benchmark, demonstrating state-of-the-art results in complex multi-step computer tasks. Agent S combines powerful foundation models (such as GPT-5) with grounding models like UI-TARS to translate visual inputs into precise executable actions. It supports flexible deployment via CLI, SDK, or cloud, and integrates with multiple model providers including OpenAI, Anthropic, Gemini, Azure, and Hugging Face endpoints. ...
    Downloads: 7 This Week
  • 6
    UI-TARS

    UI-TARS-desktop version that can operate on your local personal device

    ...Rather than relying on rigid, manually scripted UI automation, UI-TARS uses a unified vision-language model (VLM) that integrates perception, reasoning, grounding, and action into one end-to-end framework: it “thinks before acting,” enabling flexible, general-purpose automation. This allows it to perform complex, multi-step tasks such as filling forms, downloading files, navigating applications, and even controlling in-game actions, all by understanding the UI as a human would. The project is open-source, supports local or remote deployment, and offers a foundation for building GUI automation agents that are more robust and adaptable.
    Downloads: 4 This Week
  • 7
    Biomni

    Biomni: a general-purpose biomedical AI agent

    ...The system is built to support researchers by automating repetitive and time-consuming tasks such as literature review, data analysis, and experimental design. Biomni operates within a comprehensive environment that includes tools, APIs, and datasets, enabling it to execute multi-step research processes rather than just generating text responses. It supports integration with multiple AI models, allowing flexibility in selecting the most appropriate model for specific tasks.
    Downloads: 0 This Week
  • 8
    Karpathy

    An agentic Machine Learning Engineer

    ...It is intended primarily for research and experimentation with autonomous ML workflows rather than as a polished production platform. Overall, Karpathy represents an early step toward fully automated machine learning engineering driven by agentic AI systems.
    Downloads: 0 This Week
  • 9
    Lingua-Py

    The most accurate natural language detection library for Python

    Its task is simple: It tells you which language some text is written in. This is very useful as a preprocessing step for linguistic data in natural language processing applications such as text classification and spell checking. Other use cases, for instance, might include routing e-mails to the right geographically located customer service department, based on the e-mails' languages. Language detection is often done as part of large machine learning frameworks or natural language processing applications. ...
    Downloads: 0 This Week
  • 10
    NagaAgent

    A simple yet powerful agent framework for personal assistants

    ...The project includes mechanisms for semantic memory, reasoning pipelines, and integration points with external data sources and language models so that agents can interpret natural language instructions and produce coherent multi-step outputs. Rather than being a simple chatbot, NagaAgent emphasizes persistent thought cycles, context retention, and the ability to decompose complex tasks into smaller executable units, earning it a place in research explorations of agent design. Its architecture facilitates extensibility, allowing developers to plug in different reasoning modules or knowledge sources depending on the domain of use.
    Downloads: 1 This Week
  • 11
    ChatTTS_colab

    One-click deployment (including offline integration package)

    ...The project also implements multi-speaker or role-based reading, letting users assign different voices to different characters in a script and even use a large language model to generate that script in one step.
    Downloads: 1 This Week
  • 12
    AI Agent Deep Dive

    AI Agent Source Code Deep Research Report

    AI Agent Deep Dive is a comprehensive educational repository designed to provide a deep and structured understanding of how modern AI agents work, focusing on architecture, workflows, and real-world implementation patterns. It breaks down complex concepts such as planning, tool usage, memory management, and multi-step reasoning into digestible explanations and practical examples. The project is organized as a learning resource rather than a standalone framework, making it particularly useful for developers who want to move beyond surface-level prompt engineering into full agent system design. It explores how agents interact with environments, execute tasks, and maintain context over time, highlighting both strengths and limitations of current approaches. ...
    Downloads: 0 This Week
  • 13
    AIBuildAI

    An AI agent that automatically builds AI models

    AI-Build-AI is an open-source framework focused on enabling autonomous systems that can design, generate, and improve AI applications with minimal human intervention. The project explores recursive AI development, where models are used not only as tools but as builders capable of constructing other AI systems, workflows, or components. It provides a structured environment for orchestrating agents that can plan, execute, and refine tasks such as code generation, system design, and iterative...
    Downloads: 0 This Week
  • 14
    Hello-Agents

    Building an Intelligent Agent from Scratch

    ...The project focuses on guiding learners beyond superficial framework usage toward deeper comprehension of agent architecture, reasoning loops, and real-world implementation patterns. It walks users through core concepts such as ReAct-style reasoning, tool usage, memory handling, and multi-step task execution, enabling hands-on experimentation with modern LLM-powered agent systems. The repository is structured as a progressive learning path, combining theory, exercises, and runnable code so users can incrementally build more capable agents. Its goal is to demystify agent engineering and help developers move from simple prompt scripts to robust autonomous systems.
    Downloads: 0 This Week
  • 15
    ToolUniverse

    Democratizing AI scientists with ToolUniverse

    ...Instead of requiring custom pipelines or fine-tuning, ToolUniverse wraps around existing models and enables them to reason, experiment, and iterate on complex workflows such as drug discovery, data analysis, and hypothesis testing. The platform abstracts tool usage behind a consistent interface, allowing AI agents to compose multi-step workflows, refine tool definitions automatically, and even generate new tools from natural language descriptions.
    Downloads: 0 This Week
  • 16
    LaVague

    Framework for building AI agents that automate complex web tasks

    ...It implements the concept of a Large Action Model framework, allowing agents to interpret a user-provided objective and translate it into a sequence of actions performed in a browser. These agents can navigate web pages, retrieve information, fill out forms, and execute multi-step workflows automatically. LaVague is centered around a World Model that analyzes the current webpage state and determines the next set of instructions, combined with an Action Engine that converts those instructions into executable automation code. It can use browser automation tools such as Selenium or Playwright to interact with websites programmatically. ...
    Downloads: 0 This Week
  • 17
    MiroFlow

    Agent framework that enables tool-use agent tasks

    ...One of the core innovations of MiroFlow is its use of agent graphs, which enable flexible orchestration of multiple sub-agents and tools in order to complete complex workflows. This architecture allows agents to perform advanced reasoning tasks such as deep research, future event prediction, and multi-step knowledge analysis. The framework emphasizes reliability and scalability by incorporating robust workflow execution, concurrency management, and fault-tolerant design to handle unstable APIs or network conditions.
    Downloads: 0 This Week
  • 18
    LLMCompiler

    An LLM Compiler for Parallel Function Calling

    LLMCompiler is an open-source framework designed to optimize how large language models orchestrate multiple external tool or function calls during complex reasoning tasks. Traditional LLM agent systems typically execute tool calls sequentially, which can create latency, higher costs, and reduced reliability when solving multi-step problems. LLMCompiler addresses this limitation by applying principles from classical compilers to analyze a task and construct an execution plan that allows multiple functions to run in parallel whenever possible. The framework builds a dependency graph of required operations, identifying which tasks must run sequentially and which can be executed simultaneously. ...
    Downloads: 0 This Week
  • 19
    Cradle framework

    The Cradle framework is a first attempt at General Computer Control

    Cradle is an open-source framework designed to enable AI agents to perform complex computer tasks by interacting with software environments in a way similar to human users. The system introduces the concept of General Computer Control, where AI agents receive screenshots as input and perform actions through simulated keyboard and mouse operations. This approach allows agents to interact with any software interface without relying on specialized APIs or predefined automation scripts. The...
    Downloads: 0 This Week
  • 20
    Skywork-R1V4

    Skywork-R1V is an advanced multimodal AI model series

    ...The project introduces a model architecture that transfers the reasoning abilities of advanced text-based models into visual domains so the system can interpret images and perform multi-step reasoning about them. Instead of retraining both language and vision models from scratch, the framework uses a lightweight visual projection layer that connects a pretrained vision backbone with a reasoning-capable language model. This design allows the model to analyze images while maintaining strong textual reasoning performance, enabling tasks such as solving visual math problems, interpreting scientific diagrams, and answering questions about images.
    Downloads: 0 This Week
  • 21
    AgentBench

    A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

    AgentBench is an open-source benchmark designed to evaluate the capabilities of large language models when used as autonomous agents. Unlike traditional language model benchmarks that focus on static text tasks, AgentBench measures how models perform in interactive environments that require planning, reasoning, and decision-making. The benchmark includes multiple environments that simulate realistic scenarios such as web interaction, database querying, and problem-solving tasks. These...
    Downloads: 0 This Week
  • 22
    Agents 2.0

    An Open-source Framework for Data-centric Language Agents

    ...The project introduces a concept known as agent symbolic learning, which treats an agent pipeline similarly to a neural network computational graph. In this framework, each node in the pipeline represents a step in the reasoning or action process, while prompts and tools act as adjustable parameters analogous to neural network weights. During training, the system performs a forward execution where the agent completes a task and records the trajectory of prompts, outputs, and tool usage. A prompt-based loss function is then applied to evaluate the quality of the outcome, generating language-based gradients that guide improvements to the agent pipeline.
    Downloads: 0 This Week
  • 23
    AppAgent

    Multimodal Agents as Smartphone Users, an LLM-based multimodal agent

    AppAgent is an open-source multimodal agent framework designed to enable large language models to operate smartphone applications through natural interactions with graphical user interfaces. The system allows an AI agent to interpret visual information from the screen and translate natural language instructions into actions such as tapping, swiping, and navigating between application screens. Instead of requiring backend access to application APIs, the framework interacts with apps the same...
    Downloads: 0 This Week
  • 24
    nanochat

    The best ChatGPT that $100 can buy

    ...Its north star is approachability and speed: you can boot a fresh GPU box and drive the whole pipeline via a single script, producing a usable chat model in hours and a clear markdown report of what happened. The code is written to be read—concise training loops, transparent configs, and minimal wrappers—so you can audit each step, tweak it, and rerun without getting lost in framework indirection.
    Downloads: 0 This Week
  • 25
    NVIDIA Isaac GR00T

    NVIDIA Isaac GR00T N1.5 is the world's first open foundation model

    NVIDIA Isaac‑GR00T N1.5 is an open-source foundation model engineered for generalized humanoid robot reasoning and manipulation skills. It accepts multimodal inputs—such as language and images—and uses a diffusion transformer architecture built upon vision-language encoders, enabling adaptive robot behaviors across diverse environments. It is designed to be customizable via post-training with real or synthetic data. The vision-language model remains frozen during both pretraining and...
    Downloads: 0 This Week