A text editor in less than 1000 LOC with syntax highlighting and search
Code for running inference and finetuning with the SAM 3 model
Contexts Optical Compression
Code for openai.fm, a demo for the OpenAI Speech API
Qwen3-TTS is an open-source series of TTS models
A Family of Open Sourced Music Foundation Models
A Powerful Native Multimodal Model for Image Generation
OpenGL text using one vertex buffer, one texture and FreeType
Official inference repo for FLUX.2 models
Mozc - a Japanese Input Method Editor designed for multi-platform
A lightweight text-to-speech model with zero-shot voice cloning
Robust Speech Recognition via Large-Scale Weak Supervision
Open source text-to-speech tool, supports extra-long text
Python library and CLI tool to interface with Google Translate
Use Microsoft Edge's online text-to-speech service from Python
A high-quality rapid TTS voice cloning model
Image generation model with single-stream diffusion transformer
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Towards Human-Level Text-to-Speech through Style Diffusion
A robust, efficient, low-latency speech-to-text library
A rich text editor library for both Jetpack Compose
Industrial-level controllable zero-shot text-to-speech system
PersonaPlex code
CLIP: predict the most relevant text snippet given an image
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image