Data Infrastructure providing an approach to multimodal AI workloads
Build multimodal language agents for fast prototyping and production
OCR expert VLM powered by Hunyuan's native multimodal architecture
A Pioneering Open-Source Alternative to GPT-4o
Extract audio and video content and organize it into a Markdown note
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Easy-to-use Python library for creating 2D arcade games
AI Slack bot for reading, summarizing, and chatting with content
The data structure for multimodal data
Dealing with all unstructured data, such as reverse image search
InvokeAI is a leading creative engine for Stable Diffusion models
Build AI-powered semantic search applications
Benchmark LLMs by fighting in Street Fighter 3
Build cross-modal and multimodal applications on the cloud
SoundTranscriber can be used to generate automatic transcription / aut
AI-powered tool to quickly remove watermarks from videos flawlessly
Overcoming Data Limitations for High-Quality Video Diffusion Models
xSTUDIO is a high-performance playback and review tool
myplayer Free Karaoke & Media Player Software (Myanmar)
Utility to make home movies from your digital camera files
My MSX programs and some additional .cas tools
Ainee - AI Notetaking and Learning Companion
An advanced file manager with QSS themes and ISO and folder previews
CLIP + FFT/DWT/RGB = text to image/video