Open-source AI agent framework
Even 1 minute of voice data can be used to train a good TTS model
Agentic, Reasoning, and Coding (ARC) foundation models
Data manipulation and transformation for audio signal processing
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
Awesome multilingual OCR toolkits based on PaddlePaddle
Official inference repo for FLUX.2 models
YOLOv5 is the world's most loved vision AI
Qwen3-TTS is an open-source series of TTS models
A set of ready-to-use Agent Skills for research, science, engineering
Python bindings for llama.cpp
AI Fully Automated Short Video Engine
Fast stable diffusion on CPU and AI PC
Advanced LLM-powered brute-force tool combining AI intelligence
Tokenizer-Free TTS for Multilingual Speech Generation
Faster Whisper transcription with CTranslate2
Automatic Speech Recognition with Word-level Timestamps
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Open-source multi-speaker long-form text-to-speech model
Fully automatic censorship removal for language models
Text and image to video generation: CogVideoX and CogVideo
Open-source autonomous AI software engineer
The official Meta Llama 3 GitHub site
Python inference and LoRA trainer package for the LTX-2 audio–video
GLM-4.5: Open-source LLM for intelligent agents by Z.ai