4M: Massively Multimodal Masked Modeling
A Customizable Image-to-Video Model based on HunyuanVideo
Agent S: an open agentic framework that uses computers like a human
Collections of robotics environments
kaldi-asr/kaldi is the official location of the Kaldi project
AI Toolkit for Healthcare Imaging
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Data Science Guide With Videos And Materials
The data structure for multimodal data
Cross-platform C++ library for use as a default application framework.
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Open-weight, large-scale hybrid-attention reasoning model
Large-scale Self-supervised Pre-training Across Tasks, Languages, etc.
A Linux system for modern computers on an immutable foundation.
A Pioneering Open-Source Alternative to GPT-4o
Towards Real-World Vision-Language Understanding
Planning & simulation of small PV systems
Production-ready data processing made easy and shareable
An Arch Linux OS with 20+ custom GUI utilities & MLP theme customizer.
Open-source behavioral intelligence platform for detecting child groom
Release for Improved Denoising Diffusion Probabilistic Models
Images to inference with no labeling
Get a ChatGPT plugin up and running in under 5 minutes
Official release of InternLM series