Wan2.1: Open and Advanced Large-Scale Video Generative Model
Robust Speech Recognition via Large-Scale Weak Supervision
Contexts Optical Compression
Stanford CoreNLP, a Java suite of core NLP tools
AI bridge enabling assistants to control and automate the Unity Editor
High-Quality Voice Cloning TTS for 600+ Languages
A generative speech model for daily dialogue
Official inference repo for FLUX.2 models
Python tool for converting files and office documents to Markdown
Open source semantic search and text analytics for large document sets
A simple, high-quality voice conversion tool focused on ease of use
Offline Text To Speech synthesis for Python
An easy 1-click way to create beautiful artwork on your PC using AI
Use Microsoft Edge's online text-to-speech service from Python
Automatic Speech Recognition with Word-level Timestamps
Lightning-fast, on-device TTS, running natively via ONNX
Generate audiobooks from e-books
Official MiniMax Model Context Protocol (MCP) server
Coding agent for DeepSeek models that runs in your terminal
Generate audiobooks from e-books, voice cloning & 1107+ languages
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
A Powerful Native Multimodal Model for Image Generation
Text and image to video generation: CogVideoX and CogVideo
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
CLIP: predict the most relevant text snippet given an image