Call all LLM APIs using the OpenAI format [Anthropic, Huggingface, Cohere, Azure OpenAI etc.] liteLLM supports streaming the model response back, pass stream=True to get a streaming iterator in response. Streaming is supported for OpenAI, Azure, Anthropic, and Huggingface models.
A blazing fast AI Gateway with integrated guardrails
Portkey AI Gateway aims to offer a blazing fast, secure, and flexible gateway for interacting with a wide variety of models and enforcing guardrails. It presents a single, friendly API through which you can route to 200+ LLMs, while applying configurable input/output guardrails to enforce policies or restrict certain content. It supports automatic retries, fallbacks, load balancing across providers or keys, and request timeouts to avoid latency spikes. The gateway is multimodal: it can...