Large Audio Language Model built for natural interactions
Convert files and web content into clean, usable Markdown easily
The Python library for real-time communication
A lightweight text-to-speech model with zero-shot voice cloning
Document Image Parsing via Heterogeneous Anchor Prompting
MOSS‑TTS Family open‑source speech and sound generation model
A React-based starter app for using the Live API over WebSockets
Access to Anthropic's safety-first language model APIs
WhatsApp MCP server enabling AI access to chats and messaging
Tokenizer-Free TTS for Multilingual Speech Generation
StreamSpeech is a seamless model for offline speech recognition
Open source text-to-speech tool, supports extra-long text
An HTML5 video player with a parser that saves traffic
A nearly-live implementation of OpenAI's Whisper
Towards Human-Sounding Speech
A 0.1B Omni model trained from scratch
One-click deployment (including offline integration package)
AI-powered MCP server for desktop file and terminal automation
OpenAI Assistants API quickstart with Next.js
Cross-platform, customizable ML solutions
Python library for building agents that leverages Google Antigravity
MOSS-TTS-Nano is an open-source multilingual tiny speech generation
Real-time transport layer for Java AI agents
Provides convenient access to the Anthropic REST API from any Python 3
DeepSeek 4 Flash local inference engine for Metal