Port of Facebook's LLaMA model in C/C++
Tongyi Deep Research, the Leading Open-source Deep Research Agent
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Agentic, Reasoning, and Coding (ARC) foundation models
Revolutionizing Database Interactions with Private LLM Technology
Python bindings for llama.cpp
Qwen3 is the large language model series developed by the Qwen team
Powerful AI language model (MoE) optimized for efficiency/performance
Phi-3.5 for Mac: Locally-run Vision and Language Models
A fast, local neural text-to-speech system
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Open-source, high-performance AI model with advanced reasoning
RGBD video generation model conditioned on camera input
ChatGLM-6B: An Open Bilingual Dialogue Language Model
C#/.NET binding of llama.cpp, including LLaMA/GPT model inference
Image generation model with single-stream diffusion transformer
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Contexts Optical Compression
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
State-of-the-art TTS model under 25MB
Qwen-Image is a powerful image generation foundation model
Official inference repo for FLUX.2 models
The official repo of Qwen chat & pretrained large language model
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model