Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Designed for text embedding and ranking tasks
Inference framework for 1-bit LLMs
Ling is a MoE LLM provided and open-sourced by InclusionAI
Open-source deep-learning framework
Phi-3.5 for Mac: Locally-run Vision and Language Models
Open-source framework for intelligent speech interaction
Chat & pretrained large vision language model
Diversity-driven optimization and large-model reasoning ability
Pokee Deep Research Model Open Source Repo
Implementation of the Surya Foundation Model for Heliophysics
Chat & pretrained large audio language model proposed by Alibaba Cloud
Long-form streaming TTS system for multi-speaker dialogue generation
Block Diffusion for Ultra-Fast Speculative Decoding
OCR expert VLM powered by Hunyuan's native multimodal architecture
Capable of understanding text, audio, vision, and video
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
A trainable PyTorch reproduction of AlphaFold 3
High-Fidelity and Controllable Generation of Textured 3D Assets
Multi-modal large language model designed for audio understanding
The official PyTorch implementation of Google's Gemma models
General-purpose image editing model that delivers high-fidelity
LLM-based Reinforcement Learning audio edit model
Multimodal embedding and reranking models built on Qwen3-VL