Build multimodal language agents for fast prototyping and production
OCR expert VLM powered by Hunyuan's native multimodal architecture
A Pioneering Open-Source Alternative to GPT-4o
Extract audio and video content and organize it into a Markdown note
Easy-to-use Python library for creating 2D arcade games
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
The data structure for multimodal data
AI Slack bot for reading, summarizing, and chatting with content
Dealing with all unstructured data, such as reverse image search
21 Lessons, Get Started Building with Generative AI
Build AI-powered semantic search applications
InvokeAI is a leading creative engine for Stable Diffusion models
Benchmark LLMs by fighting in Street Fighter 3
Build cross-modal and multimodal applications on the cloud
AI-powered tool to quickly remove watermarks from videos flawlessly
SoundTranscriber can be used to generate automatic transcription / aut
Overcoming Data Limitations for High-Quality Video Diffusion Models
xSTUDIO is a high performance playback and review tool.
myplayer Free Karaoke & Media Player Software (Myanmar)
Utility to make home movies from your digital camera files
My MSX programs and some additional .cas tools
Ainee - AI Notetaking and Learning Companion
An advanced file manager with QSS themes and ISO and folder previews
CLIP + FFT/DWT/RGB = text to image/video