The most intuitive, flexible way for researchers to build models
Build AI-powered semantic search applications
Large Multimodal Models for Video Understanding and Editing
AIMET is a library that provides advanced quantization and compression
Experimental, AI/ML-powered, open-source Marketing Mix Modeling
Real-time voice interactive digital human
Concatenate a directory full of files into a single prompt
GUI Exploration Lab. One of the best GUI agent solutions
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Open-source infrastructure for Computer-Use Agents. Sandboxes
Flexible Photo Recrafting While Preserving Your Identity
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Collection of reference environments, offline reinforcement learning
Simple and easily configurable grid world environments
Diversity-driven optimization and large-model reasoning ability
Deploy and share agents with open infrastructure
LLM training in simple, raw C/CUDA
Less Code, Lower Barrier, Faster Deployment
A simple, secure MCP-to-OpenAPI proxy server
The most powerful Android RPA agent framework
A fast, powerful, and simple hierarchical vision transformer
Code release for Cut and Learn for Unsupervised Object Detection
Official implementation of Watermark Anything with Localized Messages
High-resolution models for human tasks
A state-of-the-art open visual language model