Learn how to develop, deploy and iterate on production-grade ML
AI Agent Evaluator & Red Team Platform
Vertically Unified Agents for Graph Retrieval-Augmented Reasoning
A large-scale model of medical consultation in Chinese
LongBench v2 and LongBench (ACL '25 & '24)
On the Structural Pruning of Large Language Models
SQL-Driven RAG Engine
Streamlines and simplifies prompt design for both developers
AI-powered code assistant for Vim. OpenAI and ChatGPT plugin for Vim
MemoryOS is designed to provide a memory operating system
Benchmark LLMs by fighting in Street Fighter 3
Cache-Augmented Generation: A Simple, Efficient Alternative to RAG
Recipes to train reward models for RLHF
An LLM Compiler for Parallel Function Calling
Autoregressive Model Beats Diffusion
An agentless approach to automatically solve software development
Neural network architecture based on ideas from the original LSTM
LISA: Reasoning Segmentation via Large Language Model
Build multimodal language agents for fast prototyping and production
Leaderboard Comparing LLM Performance at Producing Hallucinations
Skywork-R1V is an advanced multimodal AI model series
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
High-performance inference framework for large language models
Code and models for the ICML 2024 paper NExT-GPT
Run PyTorch LLMs locally on servers, desktop and mobile