Wan2.2: Open and Advanced Large-Scale Video Generative Model
Native and Compact Structured Latents for 3D Generation
The most powerful local music generation model
Fast and memory-efficient exact attention
An enhancement tool for CodexApp that aims to make Codex easier to use
Robust Speech Recognition via Large-Scale Weak Supervision
Powerful AI language model (MoE) optimized for efficiency/performance
Awesome multilingual OCR toolkits based on PaddlePaddle
Fully Automated AI Short Video Engine
Official inference repo for FLUX.1 models
Improve your Baduk skills by training with KataGo
Generate audiobooks from e-books
An open source implementation of CLIP
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Effortless data labeling with AI support from Segment Anything
OCR software, free and offline
OBLITERATE THE CHAINS THAT BIND YOU
NVR with realtime local object detection for IP cameras
Lightweight Face Recognition and Facial Attribute Analysis
Python tool for converting files and office documents to Markdown
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Fast stable diffusion on CPU and AI PC
Code for running inference and finetuning with SAM 3 model
Even 1 minute of voice data can be used to train a good TTS model
Faster Whisper transcription with CTranslate2