An efficient forwarding service designed for LLMs
Test-Time Reinforcement Learning
A simple yet powerful agent framework that delivers with models
Bridging LLM and Recommender System
Unify Efficient Fine-tuning of RAG Retrieval, including Embedding
Semi-Structured Agentic Framework. Workflows build themselves
Minimal reproduction of OneRec
A powerful tool for automated LLM fuzzing
AI-powered tool for efficient abstract and PDF screening
A high-quality PDF to Markdown tool based on large language models
Specify a GitHub or local repo, GitHub pull request
Easy token price estimates for 400+ LLMs. TokenOps
Deploy your agentic workflows to production
MoBA: Mixture of Block Attention for Long-Context LLMs
The terminal client for Ollama
Modular AI runtime for robots
[NeurIPS2025 Spotlight] Quantized Attention
Maimaibot, a (more focused) multi-platform intelligent agent
The first AI agent that builds permissionless integrations
A python module to repair invalid JSON from LLMs
Weaving the Digital Agent Galaxy
AirLLM 70B inference with a single 4GB GPU
Unified framework for building enterprise RAG pipelines
Production-grade platform for building agentic IM bots
One-stop solution for creating your digital avatar from chat history