The most powerful and modular diffusion model GUI, API and backend
OCRmyPDF adds an OCR text layer to scanned PDF files
Open source machine learning framework
Agentic, Reasoning, and Coding (ARC) foundation models
3D reconstruction software
The all-in-one Desktop & Docker AI application with full RAG and AI
Kimi K2 is the large language model series developed by Moonshot AI
Code for running inference and finetuning with SAM 3 model
Open-source, high-performance AI model with advanced reasoning
Awesome multilingual OCR toolkits based on PaddlePaddle
Qwen3 is the large language model series developed by Qwen team
Speech-to-text, text-to-speech, and speaker recognition
Captcha solver extension for humans
One minute of voice data is enough to train a good TTS model
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Lightweight coding agent that runs in your terminal
Powerful AI language model (MoE) optimized for efficiency/performance
User-friendly AI Interface
NVR with real-time local object detection for IP cameras
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Open-source vector similarity search for Postgres
Robust Speech Recognition via Large-Scale Weak Supervision
RGBD video generation model conditioned on camera input
A simple, high-quality voice conversion tool focused on ease of use
Prompt, run, edit, & deploy full-stack web applications using any LLM