Token-Oriented Object Notation (TOON)
CLI proxy that reduces LLM token consumption
Powerful AI language model (MoE) optimized for efficiency/performance
Easy token price estimates for 400+ LLMs. TokenOps
TONL (Token-Optimized Notation Language)
Open-source, high-performance AI model with advanced reasoning
Fast, local-first web content extraction for LLMs
The production toolkit for LLMs. Observability, prompt management
Minimal reproduction of OneRec
Real-time multi-AI collaboration: Claude, Codex & Gemini
Learn How LLM Transformer Models Work with Interactive Visualization
MoBA: Mixture of Block Attention for Long-Context LLMs
User toolkit for analyzing and interfacing with Large Language Models
An efficient forwarding service designed for LLMs
dLLM: Simple Diffusion Language Modeling
Completely free, private, UI-based Tech Documentation MCP server
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Large language model and vision-language model based on Linear Attention
Agentic, Reasoning, and Coding (ARC) foundation models
Open-weight, large-scale hybrid-attention reasoning model
Qwen3 is the large language model series developed by Qwen team
Uncertainty Quantification for Language Models is a Python package
Performance-optimized AI inference on your GPUs
A Telegram bot for Large Language Models
High-performance Inference and Deployment Toolkit for LLMs and VLMs