Structured outputs for LLMs
Python bindings for llama.cpp
Run Local LLMs on Any Device. Open-source
Port of Facebook's LLaMA model in C/C++
A high-throughput and memory-efficient inference and serving engine
Advanced language and coding AI model
Agentic, Reasoning, and Coding (ARC) foundation models
Qwen3 is the large language model series developed by Qwen team
Low-code app builder for RAG and multi-agent AI applications
Powerful AI language model (MoE) optimized for efficiency/performance
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Open-source, high-performance AI model with advanced reasoning
LLM
ChatGLM3 series: Open Bilingual Chat LLMs
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Lightweight package to simplify LLM API calls
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Interact with your documents using the power of GPT
An elegant PyTorch implementation of transformers
Easy-to-use LLM fine-tuning framework (LLaMA-2, BLOOM, Falcon
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible
Operating LLMs in production
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
ChatGLM2-6B: An Open Bilingual Chat LLM
Qwen3-Coder is the code version of Qwen3