Framework for building realtime multimodal voice AI agent apps
Build and run agents you can see, understand and trust
Transforming Multimodal Content into Captivating Multilingual Audio
Context-aware desktop AI assistant that understands screen content
Run a full local LLM stack with one command using Docker
AI Slack bot for reading, summarizing, and chatting with content
MARS5 speech model (TTS) from CAMB.AI
SDG is a specialized framework
PyTorch3D is FAIR's library of reusable components for deep learning
Flowly is 100x faster than OpenClaw
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
A specialized Claude Code workspace for creating long-form
Easy-to-use Speech Toolkit including Self-Supervised Learning model
Framework for building AI-powered interactive digital humans and agents
SoTA open-source TTS
A fast TTS architecture with conditional flow matching
Machine learning on FPGAs using HLS
Open Source Deep Research Alternative to Reason and Search
Generate audiobooks from e-books
Fully Local Manus AI. No APIs, No $200 monthly bills
Voice Recognition to Text Tool
Multilingual speech recognition and audio understanding model
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Package manager and build abstraction tool for FPGA/ASIC development
A Python library for audio