Showing 297 open source projects for "ace-step"

  • 1
    ACE-Step 1.5

    The most powerful local music generation model

    ...Beyond straightforward text-to-music synthesis, ACE-Step 1.5 enables flexible creative workflows, including tasks like cover generation, editing existing tracks, transforming vocals to background accompaniment, and stylistic personalization using low-rank adaptation from just a few example songs.
    Downloads: 92 This Week
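The description above mentions personalization via low-rank adaptation (LoRA). As a rough illustration of the general technique only, and not of ACE-Step's actual code, the sketch below merges a low-rank update into a frozen weight matrix and counts trainable parameters. The function names, the tiny matrices, and the 1024-wide layer are assumptions made up for this example.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x[i][t] * y[t][j] for t in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def lora_merge(w, b, a, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank."""
    r = len(a)  # A has shape (r, d_in), so its row count is the rank
    scale = alpha / r
    delta = matmul(b, a)
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]


def trainable_params(d_out, d_in, r):
    """Parameters in the adapter matrices B (d_out x r) and A (r x d_in)."""
    return r * (d_out + d_in)


# Toy 2x3 frozen weight with a rank-1 adapter (hypothetical values).
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b = [[1.0], [2.0]]
a = [[0.5, 0.0, 1.0]]
merged = lora_merge(w, b, a, alpha=1.0)
print(merged)  # [[1.5, 0.0, 1.0], [1.0, 1.0, 2.0]]

# For an assumed 1024x1024 layer, a rank-8 adapter trains far fewer weights.
print(trainable_params(1024, 1024, 8), "vs", 1024 * 1024)  # 16384 vs 1048576
```

Because only the two small adapter matrices are trained, a handful of examples can be enough to shift a model's style without updating the full weight matrix.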
  • 2
    Step-Audio

    Open-source framework for intelligent speech interaction

    ...Through its architecture, Step-Audio supports multilingual interaction, dialects, emotional tones (joy, sadness, etc.), and even more creative speech styles (like rap or singing), while allowing dynamic control over speech characteristics. It also provides a “generative data engine,” which can produce synthetic speech data (cloning voices, varying style) to support TTS training.
    Downloads: 3 This Week
  • 3
    Step 3.5 Flash

    Fast, Sharp & Reliable Agentic Intelligence

    Step 3.5 Flash is a cutting-edge, open-source large language model developed by StepFun-AI that pushes the frontier of efficient reasoning and “agentic” intelligence in a way that makes powerful AI accessible beyond proprietary black boxes. Unlike dense models that activate all their parameters for every token, Step 3.5 Flash uses a sparse Mixture-of-Experts (MoE) architecture that selectively engages only about 11 billion of its roughly 196 billion total parameters per token, delivering high-quality reasoning and interaction at far lower compute cost and latency than traditional large models. ...
    Downloads: 3 This Week
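To make the sparse-activation claim above concrete, here is a minimal sketch of top-k expert routing together with the active-parameter arithmetic. It is a generic Mixture-of-Experts illustration, not Step 3.5 Flash's actual router: the function names, the four-expert gate, and k=2 are assumptions; only the 11 billion and 196 billion figures come from the description.

```python
import math


def top_k_route(gate_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    peak = max(gate_logits[i] for i in top)  # subtract the max for stability
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def active_fraction(active_params, total_params):
    """Share of the model's parameters engaged for a single token."""
    return active_params / total_params


# Hypothetical gate scores for one token over four experts.
routes = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)
print([expert for expert, _ in routes])  # [1, 3]

# Figures quoted in the description: about 11B active of roughly 196B total.
print(f"{active_fraction(11e9, 196e9):.1%}")  # 5.6%
```

Only the selected experts run for a given token, which is why per-token compute tracks the active parameter count rather than the total.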
  • 4
    Step-Audio-EditX

    LLM-based Reinforcement Learning audio edit model

    Step-Audio-EditX is an open-source, 3 billion-parameter audio model from StepFun AI designed to make expressive and precise editing of speech and audio as easy as text editing. Rather than treating audio editing as low-level waveform manipulation, this model converts speech into a sequence of discrete “audio tokens” (via a dual-codebook tokenizer) — combining a linguistic token stream and a semantic (prosody/emotion/style) token stream — thereby abstracting audio editing into high-level token operations. ...
    Downloads: 2 This Week
  • 5
    Step-Video-T2V

    State-of-the-art (SoTA) text-to-video pre-trained model

    ...Its training and generation pipeline includes techniques like flow-matching, full 3D attention for temporal consistency, and fine-tuning approaches (e.g. video-based DPO) to improve fidelity and reduce artifacts. As a result, Step-Video-T2V aims to push the frontier of open-source video generation.
    Downloads: 1 This Week
  • 6
    Step-Audio 2

    Multi-modal large language model designed for audio understanding

    Step-Audio 2 is an advanced, end-to-end multimodal large language model designed for high-fidelity audio understanding and natural speech conversation. Unlike many pipelines that separate speech recognition, processing, and synthesis, Step-Audio 2 processes raw audio, reasons about semantic and paralinguistic content (such as emotion, speaker characteristics, and non-verbal cues), and can generate contextually appropriate responses, potentially including generated or transformed audio output. ...
    Downloads: 0 This Week
  • 7
    TTS WebUI

    A single Gradio + React WebUI with extensions for ACE-Step

    TTS-WebUI is a unified Gradio + React web interface that brings together a large ecosystem of text-to-speech, voice conversion, and audio generation models under a single UI. It supports a wide range of models such as Bark, MusicGen, Tortoise, RVC, StyleTTS2, ParlerTTS, CosyVoice, XTTSv2, Stable Audio, SeamlessM4T, and many others, exposing them as interchangeable backends for speech and music synthesis. The project provides an installer that sets up Conda, Python environments, and all...
    Downloads: 11 This Week
  • 8
    Build Your Own OpenClaw

    A step-by-step guide to build your own AI agent

    Build Your Own OpenClaw is a step-by-step educational framework that teaches developers how to construct a fully functional AI agent system from scratch, gradually evolving from a simple chat loop into a multi-agent, production-ready architecture. The project is structured into 18 progressive stages, each introducing a new concept such as tool usage, memory persistence, event-driven design, and multi-agent coordination, with each step including both explanatory documentation and runnable code. ...
    Downloads: 2 This Week
  • 9
    GLM-5

    From Vibe Coding to Agentic Engineering

    ...Building on earlier GLM series models, GLM-5 dramatically scales the parameter count (to roughly 744 billion) and expands pre-training data to significantly improve performance on complex tasks such as multi-step reasoning, software engineering workflows, and agent orchestration compared to its predecessors like GLM-4.5. It incorporates innovations like DeepSeek Sparse Attention (DSA) to preserve massive context windows while reducing deployment costs and supporting long context processing, which is crucial for detailed plans and agent tasks.
    Downloads: 246 This Week
  • 10
    Qwen3.5

    Qwen3.5 is the large language model series developed by Qwen team

    Qwen3.5 is part of Alibaba’s Qwen family of large language and multimodal foundation models, designed to power advanced AI applications such as chatbots, coding assistants, and autonomous agents. The project represents a significant step toward “agentic AI,” meaning models that can reason through multi-step tasks and interact with external tools or environments rather than only generating text. Qwen3.5 builds on earlier Qwen generations by improving multilingual understanding, reasoning ability, and efficiency, while also introducing native multimodal capabilities that allow the model to work with both language and visual inputs. ...
    Downloads: 15 This Week
  • 11
    self-llm

    Tutorial tailored for Chinese beginners on rapid fine-tuning

    ...The repository focuses on helping beginners and developers understand how to run and customize modern LLMs locally rather than relying solely on hosted APIs. It provides step-by-step tutorials covering environment setup, model deployment, inference workflows, and efficient fine-tuning techniques such as LoRA and parameter-efficient training. The project also includes guides for integrating models into real applications, including command-line interfaces, web demos, and frameworks like LangChain. By combining theory, configuration instructions, and runnable examples, self-llm lowers the barrier to entry for students and engineers who want to experiment with open-source models.
    Downloads: 4 This Week
  • 12
    LLMs-from-scratch

    Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

    LLMs-from-scratch is an educational codebase that walks through implementing modern large-language-model components step by step. It emphasizes building blocks—tokenization, embeddings, attention, feed-forward layers, normalization, and training loops—so learners understand not just how to use a model but how it works internally. The repository favors clear Python and NumPy or PyTorch implementations that can be run and modified without heavyweight frameworks obscuring the logic. ...
    Downloads: 3 This Week
  • 13
    Kimi K2.5

    Moonshot's most powerful AI model

    ...Based on a 1T-parameter Mixture-of-Experts (MoE) architecture with 32B activated parameters, it integrates advanced language reasoning with strong visual understanding. K2.5 supports both “Thinking” and “Instant” modes, enabling either deep step-by-step reasoning or low-latency responses depending on the task. Designed for agentic workflows, it features an Agent Swarm mechanism that decomposes complex problems into coordinated sub-agents executing in parallel. With a 256K context length and MoonViT vision encoder, the model excels across reasoning, coding, long-context comprehension, image, and video benchmarks. ...
    Downloads: 43 This Week
  • 14
    verl-agent

    Designed for training LLM/VLM agents via RL

    ...Developers can configure memory modules that determine how historical information is stored and incorporated into each step of the reasoning process.
    Downloads: 0 This Week
  • 15
    Langflow

    Low-code app builder for RAG and multi-agent AI applications

    Langflow is a low-code app builder for RAG and multi-agent AI applications. It’s Python-based and agnostic to any model, API, or database.
    Downloads: 21 This Week
  • 16
    Claude Code Architecture Study

    Research on Coding Agents

    ...The project focuses on breaking down the architecture of agentic systems, including how models perceive context, make decisions, and execute actions in a coding environment. It likely provides step-by-step examples, conceptual explanations, and practical implementations that guide users through creating their own agents. The framework emphasizes learning by doing, allowing users to experiment with agent behavior, prompt design, and workflow structuring. It also explores how agents interact with tools such as file systems, terminals, and APIs, giving a holistic view of real-world applications. ...
    Downloads: 0 This Week
  • 17
    xiaohongshu-ops

    Turn OpenClaw into a Xiaohongshu operations assistant

    ...It also provides practical frameworks for increasing visibility, improving content performance, and leveraging trends effectively. The content is organized to support both beginners and experienced operators, offering step-by-step strategies as well as advanced growth tactics.
    Downloads: 0 This Week
  • 18
    thorough-pytorch

    PyTorch Getting Started Tutorial, read online

    ...It emphasizes a learning approach that combines theoretical explanations with hands-on coding exercises so that students can build and experiment with neural networks directly. The project encourages collaborative learning and often organizes materials in a step-by-step progression that gradually increases in complexity. Topics include neural network fundamentals, training procedures, model evaluation, and practical deep learning workflows. By combining structured lessons with programming projects, the repository aims to help learners develop both conceptual understanding and practical implementation skills.
    Downloads: 0 This Week
  • 19
    GPT PILOT

    The first real AI developer

    ...Unlike simple autocomplete tools, it aims to function as a true AI engineer that can generate features, set up environments, debug code, and request feedback when necessary. The system works by asking clarifying questions, producing product requirements, and then implementing the application step by step while the user supervises. It powers the Pythagora VS Code extension and relies on coordinated AI agents that mimic roles in a real development workflow. GPT Pilot is intended to automate the majority of routine coding work while leaving strategic decisions and final review to the human developer. Overall, the project represents an ambitious attempt to move from AI coding assistance toward semi-autonomous software development.
    Downloads: 0 This Week
  • 20
    Agent SOP

    Natural language workflows for AI agents

    Agent SOP is a framework that implements structured operational procedures (SOPs) for autonomous agents so that they can carry out complex multi-step tasks reliably and in a defined order. Instead of relying solely on broad language model reasoning, this project enforces explicit step sequences with checkpoints, conditional transitions, and rollback logic, making agent workflows more predictable and auditable. It defines reusable SOP templates that agents can instantiate with context-specific parameters, allowing organizations to codify best practices for customer support, data processing, document workflows, or incident response. ...
    Downloads: 1 This Week
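The explicit step sequences, checkpoints, conditional transitions, and rollback logic described above can be pictured with a small sketch. This is not Agent SOP's API: the `Step` dataclass, the `execute` function, and the three sample steps are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    name: str
    run: Callable[[Dict], None]    # performs the step on the shared context
    check: Callable[[Dict], bool]  # checkpoint that must hold after run
    undo: Callable[[Dict], None]   # rollback action for this step
    skip_if: Optional[Callable[[Dict], bool]] = None  # conditional transition


def execute(steps: List[Step], ctx: Dict) -> List[str]:
    """Run steps in order; on a failed checkpoint, undo completed steps in reverse."""
    done: List[Step] = []
    log: List[str] = []
    for step in steps:
        if step.skip_if is not None and step.skip_if(ctx):
            log.append(f"skip:{step.name}")
            continue
        step.run(ctx)
        done.append(step)
        if not step.check(ctx):
            for prev in reversed(done):
                prev.undo(ctx)
                log.append(f"undo:{prev.name}")
            return log
        log.append(f"ok:{step.name}")
    return log


steps = [
    Step("load", lambda c: c.update(rows=3),
         lambda c: c["rows"] > 0, lambda c: c.pop("rows", None)),
    Step("notify", lambda c: c.update(sent=True),
         lambda c: True, lambda c: c.pop("sent", None),
         skip_if=lambda c: c.get("quiet", False)),
    Step("validate", lambda c: c.update(valid=False),
         lambda c: c["valid"], lambda c: c.pop("valid", None)),
]
ctx = {"quiet": True}
log = execute(steps, ctx)
print(log)  # ['ok:load', 'skip:notify', 'undo:validate', 'undo:load']
print(ctx)  # {'quiet': True}
```

Keeping the order, checkpoints, and undo actions in data rather than in free-form model reasoning is what makes a run predictable and easy to audit from the log.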
  • 21
    pi-autoresearch

    Autonomous experiment loop extension for pi

    ...The system likely integrates with external data sources or APIs to retrieve information and process it into structured insights. Its architecture suggests a focus on autonomy, allowing it to run multi-step research pipelines that mimic human investigative processes. This makes it particularly useful for exploratory analysis, trend discovery, or generating structured knowledge from large information spaces. Overall, pi-autoresearch represents a step toward self-directed research agents capable of producing increasingly refined outputs over time.
    Downloads: 3 This Week
  • 22
    Browser Agent

    AI Browser Agent is an advanced Browser AI tool

    ...The tool allows developers to describe tasks in plain English, such as navigating pages, clicking elements, filling forms, and extracting data, and the system executes those actions as if a human were interacting with the browser. It is designed to simplify complex automation workflows by removing the need for manually written selectors or step-by-step scripts. The agent supports multi-step task execution, enabling it to perform sequences of actions across multiple pages while maintaining context. It also provides structured output formats such as JSON, HTML, Markdown, or screenshots, making it easy to integrate results into other systems or pipelines. Because it can interact with dynamic, JavaScript-heavy websites, it is suitable for modern web scraping and automation tasks.
    Downloads: 0 This Week
  • 23
    plexe

    Build a machine learning model from a prompt

    ...You describe what you want—a predictor, a classifier, a forecaster—and the tool plans data ingestion, feature preparation, model training, and evaluation automatically. Under the hood an agent executes the plan step by step, surfacing intermediate results and artifacts so you can inspect or override choices. It aims to be production-minded: models can be exported, versioned, and deployed, with reports to explain performance and limitations. The project supports both a Python library and a managed cloud option, meeting teams wherever they prefer to run workloads. ...
    Downloads: 3 This Week
  • 24
    OpenAI Quickstart Node

    Node.js example app from the OpenAI API quickstart tutorial

    ...The project is a practical starting point for building AI-powered applications, serving as a foundation for experimentation and integration into larger projects. It simplifies onboarding by offering step-by-step setup instructions and ready-to-use code snippets that can be adapted for custom needs.
    Downloads: 0 This Week
  • 25
    Generative AI for Beginners .NET

    Hands-on .NET course for building real-world generative AI apps

    ...It walks through core concepts such as text generation, chat-based interactions, and integrating large language models into applications. Each lesson includes short videos, working code samples, and step-by-step instructions, making it easy to follow and apply immediately. Generative AI for Beginners .NET supports tools like GitHub Models, Azure OpenAI Service, and local models, giving flexibility in how projects are built and tested. Developers can run examples locally or in cloud-based environments such as GitHub Codespaces. It focuses on practical implementation rather than theory, helping users move from simple experiments to complete AI-powered solutions while understanding responsible AI usage and modern development workflows.
    Downloads: 1 This Week