Desktop app for prototyping and debugging LangGraph applications
Framework to build resilient language agents as graphs
The all-in-one Desktop & Docker AI application with full RAG and AI
A high-throughput and memory-efficient inference and serving engine
GLM-5: From Vibe Coding to Agentic Engineering
Multilingual sentence & image embeddings with BERT
State-of-the-art Parameter-Efficient Fine-Tuning
State-of-the-art LLM and coding model
Modular AI runtime for robots
New set of lightweight, state-of-the-art open foundation models
AI coding agent for the terminal
Operating LLMs in production
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Bringing large language models and chat to web browsers
Qwen3-Coder is the code version of Qwen3
⚡ Building applications with LLMs through composability ⚡
TT-NN operator library, and TT-Metalium low-level kernel programming
Replace OpenAI GPT with another LLM in your app
WebAssembly binding for llama.cpp - Enabling on-browser LLM inference
Zep: A long-term memory store for LLM / Chatbot applications
MiniMax M2.1, a SOTA model for real-world dev & agents
Math-specific large language models from the Qwen2 series
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Text-space optimizer that trains reusable natural-language skills
Gemma open-weight LLM library, from Google DeepMind