An AI-powered security review GitHub Action using Claude
Official implementation of Watermark Anything with Localized Messages
Easy Docker setup for Stable Diffusion with user-friendly UI
Robust Speech Recognition Across Languages, Dialects
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
High-Fidelity and Controllable Generation of Textured 3D Assets
Large Multimodal Models for Video Understanding and Editing
Open-Source Financial Large Language Models
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Tiny vision language model
Generating Immersive, Explorable, and Interactive 3D Worlds
Inference code for scalable emulation of protein equilibrium ensembles
Capable of understanding text, audio, vision, video
Foundation Models for Time Series
Hackable and optimized Transformers building blocks
A Customizable Image-to-Video Model based on HunyuanVideo
A Unified Framework for Text-to-3D and Image-to-3D Generation
Renderer for the harmony response format to be used with gpt-oss
Project Lyra: Open Generative 3D World Models
Ultra-Efficient LLMs on End Devices
Open-source deep-learning framework
An Efficient Agentic Model for Computer Use
A trainable PyTorch reproduction of AlphaFold 3
DeepMind model for tracking arbitrary points across videos & robotics
Open-Source Speech Language Model