1 minute of voice data can also be used to train a good TTS model
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
NVR with realtime local object detection for IP cameras
OCR software, free and offline
Synchronized Translation for Videos
A simple, high-quality voice conversion tool focused on ease of use
Lightweight Face Recognition and Facial Attribute Analysis
Generate short videos with one click using an AI LLM
Use Microsoft Edge's online text-to-speech service from Python
Open-Sora: Democratizing Efficient Video Production for All
RGBD video generation model conditioned on camera input
A Python wrapper you can't refuse
Chemcrow
Models for object and human mesh reconstruction
A Gradio web UI for running Large Language Models like LLaMA
Generate audiobooks from e-books, with voice cloning and 1107+ languages
From Images to High-Fidelity 3D Assets
Qwen3-Coder is the code version of Qwen3
A command-line productivity tool powered by AI large language models
Text- and image-to-video generation: CogVideoX (2024) and CogVideo
Powerful tool that lets you create and run intelligent agents
Reference PyTorch implementation and models for DINOv3
Machine learning in Python
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
An experimental version of the DeepSeek model