Uncover insights, surface problems, monitor, and fine-tune your LLM
Create HTML profiling reports from pandas DataFrame objects
A high-throughput and memory-efficient inference and serving engine
Run serverless GPU workloads with fast cold starts on bare-metal
Standardized Serverless ML Inference Platform on Kubernetes
The free, Open Source alternative to OpenAI, Claude and others
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
An RWKV management and startup tool, fully automated, only 8 MB
Integrate, train and manage any AI models and APIs with your database
Replace OpenAI GPT with another LLM in your app
Deep Learning API and Server in C++14 with support for Caffe, PyTorch
Build Production-ready Agentic Workflows with Natural Language
OpenAI-style API for open large language models
GPU environment management and cluster orchestration
LLMs and Machine Learning done easily
Serve machine learning models within a Docker container