All-in-one WebUI for AI generative image and video creation
Capable of understanding text, audio, vision, video
GPT-4V-level open-source multi-modal model based on Llama3-8B
Qwen3-Omni is a natively end-to-end, omni-modal LLM
A Pioneering Open-Source Alternative to GPT-4o
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Data Lake for Deep Learning. Build, manage, and query datasets