An open-source toolkit for monitoring Large Language Models (LLMs)
Synchronized Translation for Videos
CLIP: predict the most relevant text snippet given an image
Multimodal embedding and reranking models built on Qwen3-VL
Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML
A simple, high-quality voice conversion tool focused on ease of use
Speech recognition module for Python
Python tool for converting files and office documents to Markdown
Official MiniMax Model Context Protocol (MCP) server
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Controllable and fast Text-to-Speech for over 7000 languages
Free, high-quality text-to-speech API endpoint to replace OpenAI
Build AI-powered applications with React, Svelte, Vue, and Solid
LLM Frontend for Power Users
Implementation of Imagen, Google's Text-to-Image Neural Network
Text mining using tidy tools
The pluggable natural language linter for text and markdown
Toolkit for conversational AI
Parse files for optimal RAG
Generate blog articles from video or audio
Implementation of Phenaki Video, which uses MaskGIT
Chat & pretrained large audio language model proposed by Alibaba Cloud
A robust, efficient, low-latency speech-to-text library
Gp.nvim (GPT prompt) Neovim AI plugin
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning