Image generation model with single-stream diffusion transformer
3D reconstruction software
OCRmyPDF adds an OCR text layer to scanned PDF files
The all-in-one Desktop & Docker AI application with full RAG and AI
Agentic, Reasoning, and Coding (ARC) foundation models
Open source machine learning framework
Code for running inference and fine-tuning with the SAM 3 model
Open-source, high-performance AI model with advanced reasoning
Kimi K2 is the large language model series developed by Moonshot AI
Captcha solver extension for humans
Structure-from-Motion and Multi-View Stereo
Speech-to-text, text-to-speech, and speaker recognition
Robust Speech Recognition via Large-Scale Weak Supervision
Awesome multilingual OCR toolkits based on PaddlePaddle
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Qwen3 is the large language model series developed by the Qwen team
A simple, high-quality voice conversion tool focused on ease of use
User-friendly AI Interface
One minute of voice data is enough to train a good TTS model
Open-source vector similarity search for Postgres
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Lightweight coding agent that runs in your terminal
Download media files from a telegram conversation/chat/channel
NVR with realtime local object detection for IP cameras
Python inference and LoRA trainer package for the LTX-2 audio–video