Desktop app for prototyping and debugging LangGraph applications
Framework to build resilient language agents as graphs
The all-in-one Desktop & Docker AI application with full RAG and AI
A high-throughput and memory-efficient inference and serving engine
GLM-5: From Vibe Coding to Agentic Engineering
Multilingual sentence & image embeddings with BERT
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
State-of-the-art Parameter-Efficient Fine-Tuning
State-of-the-art LLM and coding model
Modular AI runtime for robots
New set of lightweight state-of-the-art, open foundation models
MobileLLM: Optimizing Sub-billion Parameter Language Models
Bringing large-language models and chat to web browsers
AI coding agent for the terminal
Operating LLMs in production
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Qwen3-Coder is the code version of Qwen3
⚡ Building applications with LLMs through composability ⚡
TT-NN operator library, and TT-Metalium low-level kernel programming
Gemma open-weight LLM library, from Google DeepMind
Replace OpenAI GPT with another LLM in your app
Text-space optimizer that trains reusable natural-language skills
WebAssembly binding for llama.cpp - Enabling on-browser LLM inference
Zep: A long-term memory store for LLM / Chatbot applications
MiniMax M2.1, a SOTA model for real-world dev & agents.