Port of Facebook's LLaMA model in C/C++
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Python bindings for llama.cpp
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Agentic, Reasoning, and Coding (ARC) foundation models
Revolutionizing Database Interactions with Private LLM Technology
Qwen3 is the large language model series developed by the Qwen team
Powerful AI language model (MoE) optimized for efficiency/performance
Phi-3.5 for Mac: Locally-run Vision and Language Models
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Wan2.2: Open and Advanced Large-Scale Video Generative Model
RGBD video generation model conditioned on camera input
Open-source, high-performance AI model with advanced reasoning
ChatGLM-6B: An Open Bilingual Dialogue Language Model
C#/.NET binding of llama.cpp, including LLaMA/GPT model inference
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Qwen-Image is a powerful image generation foundation model
Contexts Optical Compression
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Image generation model with single-stream diffusion transformer
A Customizable Image-to-Video Model based on HunyuanVideo
State-of-the-art TTS model under 25MB
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Official inference repo for FLUX.2 models
Qwen3-Coder is the code version of Qwen3