Operating LLMs in production
Deploy your agentic workflows to production
Replace OpenAI GPT with another LLM in your app
Low-latency REST API for serving text-embeddings
A guidance language for controlling large language models
Cybersecurity AI (CAI), the framework for AI Security
Performance-optimized AI inference on your GPUs
Production-grade platform for building agentic IM bots
ChatGLM2-6B: An Open Bilingual Chat LLM
On the Structural Pruning of Large Language Models
A simple, performant and scalable JAX LLM
A lightweight framework for building LLM-based agents
A new open-source framework to build and deploy intelligent agents
High-performance inference framework for large language models
High-performance Inference and Deployment Toolkit for LLMs and VLMs
Implement CPU from scratch and play with large model deployments
Generative AI reference workflows
LLM training code for MosaicML foundation models
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Repo for YaYi Chinese LLMs based on LLaMA 2 & BLOOM
Ray Aviary - evaluate multiple LLMs easily
LangChain Apps in Production with Jina & FastAPI
Chinese LLaMA & Alpaca large language model + local CPU/GPU training