GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Multi-modal large language model designed for audio understanding
Qwen3-omni is a natively end-to-end, omni-modal LLM
Free OCR Software: No internet required, easy to use.
A library for audio and music analysis, feature extraction
fast C++ library for linear algebra & scientific computing
Visual Automation IDE — automate anything you see on screen
Download, save and convert multiple subtitles from YouTube videos
Transform your voice in real time with Voxal Voice Changer
Mice speech to text with MX Cinnamon OS ISO
Graphical User Interface Face Anonymization Tool
Chat & pretrained large vision language model
screen recognition and search
Local AI file organization with categorization and rename suggestions
Refactoring ChatBot+LLM, GPT-3.5-turbo, ChatGPT Bot/Voice Assistant
Blazeface is a lightweight model that detects faces in images
Drop In the Bucket Neural Networks
Detect faces in an image
Run GGUF models easily with a UI or API. One File. Zero Install.
mice stt tts
fast C++ library for GPU linear algebra & scientific computing
A Python application to add watermarks (text or image) to PDF files
Lightweight OCR
VoiceClip is a user assistance application