High-performance inference server for text embedding models (API layer)
Document (PDF, Word, PPTX ...) extraction and parsing API
A text editor in less than 1000 LOC with syntax highlighting and search
A playground to generate images from any text prompt using SD
Hypernetworks that adapt LLMs for specific benchmark tasks
Advanced translator plugin that can be used to translate Unity games
Claude Code skill that removes signs of AI-generated writing from text
Code for openai.fm, a demo for the OpenAI Speech API
OpenGL text using one vertex buffer, one texture and FreeType
TTS with kokoro and onnx runtime
The media player for language learning, with dual subtitles
Code for running inference and finetuning with SAM 3 model
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Contexts Optical Compression
Mozc - a Japanese Input Method Editor designed for multiple platforms
JavaScript OCR and text extraction for images and PDFs
Tokenizer-Free TTS for Multilingual Speech Generation
Official inference repo for FLUX.2 models
Qwen3-TTS is an open-source series of TTS models
Robust Speech Recognition via Large-Scale Weak Supervision
A Powerful Native Multimodal Model for Image Generation
A Family of Open Sourced Music Foundation Models
A simple tool for reading in poorly redacted documents
Audiocraft is a library for audio processing and generation
A robust, efficient, low-latency speech-to-text library