Bidirectional token-classification model for identifiable info
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Collection of Gemma 3 variants that are trained for performance
General-purpose image editing model that delivers high-fidelity
Fast-stable-diffusion + DreamBooth
Open-source deep-learning framework
HY-Motion model for 3D character animation generation
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
New family of code large language models (LLMs)
Revolutionizing Database Interactions with Private LLM Technology
A Multi-Modal World Model for Reconstructing, Generating, Simulation
Open-source framework for intelligent speech interaction
Sharp Monocular Metric Depth in Less Than a Second
Provides convenient access to the Anthropic REST API from any Python 3
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Diversity-driven optimization and large-model reasoning ability
Robust Speech Recognition Across Languages, Dialects
Pretrained time-series foundation model developed by Google Research
PyTorch code and models for the DINOv2 self-supervised learning
Implementation of the Surya Foundation Model for Heliophysics
Inference script for Oasis 500M
Open-source large language model family from Tencent Hunyuan