157 models, 30 providers, one command to find what runs on your hardware
Fast and efficient unstructured data extraction
Instant, controllable, local pre-trained AI models in Rust
Fast, local-first web content extraction for LLMs
CLI proxy that reduces LLM token consumption
Fast, flexible LLM inference
Open-source LLM load balancer and serving platform for hosting LLMs
A high-performance inference engine for AI models