A Python library for audio
Audiocraft is a library for audio processing and generation
Multimodal Diffusion with Representation Alignment
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Official repository for LTX-Video
Workflow and speech recognition app
Python inference and LoRA trainer package for the LTX-2 audio–video
Speech-to-text, text-to-speech, and speaker recognition
Transform your voice in real time with Voxal Voice Changer
Build AI-powered semantic search applications
The Triton Inference Server provides an optimized cloud
elevenlabs-api is an open source Java wrapper around the ElevenLabs
Java app for chatting with a generative AI (using TTS and STT)
Easy tools for PDF, image, file, network, data, and media
Integrate with the latest language models, image generation and speech
Common Resource Grep
Beauty effects that can be applied to live broadcasts, short videos, and selfies
AlphaPlayer is a video animation engine
IPTV/NVR/CCTV/Video cloud https://fastocloud.com
Speech Recognition System
Music research software
ILA is a fully customizable and teachable voice assistant for Java
Ansj word segmentation