request system free download

SuggestArr

Request recommended movies, TV shows, and anime through Jellyseerr/Overseerr

SuggestArr is an open-source automation platform that recommends and automatically requests movies, TV shows, and anime based on a user’s viewing history in self-hosted media servers. The project integrates with popular media servers such as Jellyfin, Plex, and Emby, analyzing recently watched content and identifying similar titles using metadata from the TMDb database. Once recommendations are identified, SuggestArr can automatically send requests to services like Jellyseerr or Overseerr, which then coordinate with media download tools and libraries. ...

Downloads: 0 This Week

Last Update: 4 days ago

vLLM Semantic Router

System Level Intelligent Router for Mixture-of-Models at Cloud

Semantic Router is an open-source system designed to intelligently route requests across multiple large language models based on the semantic meaning and complexity of user queries. Instead of sending every prompt to the same model, the system analyzes the intent and reasoning requirements of the request and dynamically selects the most appropriate model to process it.
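The routing idea described above can be illustrated with a minimal sketch. This is not the project's implementation: the keyword heuristic stands in for whatever classifier the real system uses, and the model names, `REASONING_HINTS`, `complexity_score`, and `route` are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str   # hypothetical model tier name, not a real endpoint
    score: int   # heuristic complexity score that led to this choice


# Assumed cue phrases that suggest a prompt needs multi-step reasoning.
REASONING_HINTS = ("prove", "derive", "step by step", "debug", "why")


def complexity_score(prompt: str) -> int:
    """Count reasoning cues (whole-word matches) plus one point for long prompts."""
    text = prompt.lower()
    score = sum(
        1 for hint in REASONING_HINTS
        if re.search(rf"\b{re.escape(hint)}\b", text)
    )
    if len(text.split()) > 50:
        score += 1
    return score


def route(prompt: str) -> Route:
    """Pick a model tier from the score instead of sending everything to one model."""
    score = complexity_score(prompt)
    if score >= 2:
        return Route("large-reasoning-model", score)
    if score == 1:
        return Route("general-model", score)
    return Route("small-fast-model", score)


if __name__ == "__main__":
    for p in ("What is the capital of France?",
              "Why is the sky blue?",
              "Prove step by step that sqrt(2) is irrational."):
        print(p, "->", route(p).model)
```

A production router would likely replace the keyword count with a learned intent or difficulty classifier; the dispatch step after scoring would keep roughly the same shape.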

Downloads: 0 This Week

Last Update: 2026-03-10

Paddler

Open-source LLM load balancer and serving platform for hosting LLMs

...The architecture is designed with privacy and cost control in mind, making it suitable for organizations that handle sensitive data or require predictable operational costs. Paddler also includes tools for monitoring, request buffering, and autoscaling integration so that deployments can adapt dynamically to changing workloads. A built-in administrative interface allows developers and operations teams to manage models, observe system performance, and test inference endpoints.

Downloads: 0 This Week

Last Update: 2026-04-30

AxonHub

Use any SDK to call 100+ LLMs

...The system also provides infrastructure features such as request routing, failover mechanisms, load balancing, and cost management for AI applications. This architecture makes it easier to experiment with multiple models and manage production deployments that rely on several providers simultaneously.

Downloads: 5 This Week

Last Update: 6 days ago

Envoy AI Gateway

Manages Unified Access to Generative AI Services

...The gateway provides policy enforcement, observability, and routing capabilities that are specifically designed for AI inference workloads, including intelligent endpoint selection and request optimization.

Downloads: 0 This Week

Last Update: 2026-05-05

Parallax

Parallax is a distributed model serving framework

Parallax is a decentralized inference framework designed to run large language models across distributed computing resources. Instead of relying on centralized GPU clusters in data centers, the system allows multiple heterogeneous machines to collaborate in serving AI inference workloads. Parallax divides model layers across different nodes and dynamically coordinates them to form a complete inference pipeline. A two-stage scheduling architecture determines how model layers are allocated to...

Downloads: 0 This Week

Last Update: 2026-03-09

Integuru v0

The first AI agent that builds permissionless integrations

Integuru is an open-source AI agent designed to automatically create integrations between software platforms by reverse-engineering their internal APIs. Instead of relying on official developer documentation or publicly available APIs, the system analyzes network traffic generated by user interactions within a web application. Developers capture browser requests and authentication data, which the agent then uses to infer the structure of the platform’s internal API endpoints. Based on this...

Downloads: 0 This Week

Last Update: 2026-04-14

tiny-llm

A course on LLM inference serving on Apple Silicon

tiny-llm is an educational open-source project designed to teach system engineers how large language model inference and serving systems work by building them from scratch. The project is structured as a guided course that walks developers through the process of implementing the core components required to run a modern language model, including attention mechanisms, token generation, and optimization techniques.

Downloads: 3 This Week

Last Update: 2026-05-21

Shell-AI

LangChain powered shell command generator and runner CLI

Shell-AI is an open-source command-line interface utility that allows users to generate and execute shell commands using natural language prompts. Instead of requiring users to remember complex command syntax, the tool lets them describe their intent in plain English and automatically suggests commands that accomplish the task. The system is powered by large language models and integrates with frameworks such as LangChain to interpret user requests and translate them into executable shell...

Downloads: 0 This Week

Last Update: 2026-03-09

Chat UI

The open source codebase powering HuggingChat

Hugging Face Chat UI is an open-source web interface designed for interacting with large language models through a modern conversational interface. The project serves as the codebase behind HuggingChat and can be deployed locally or on cloud infrastructure to create customizable AI chat applications. Built with modern web technologies such as SvelteKit and backed by MongoDB for persistence, the interface provides a responsive environment for multi-turn conversations, file handling, and...

Downloads: 0 This Week

Last Update: 2026-05-11

Punica

Serving multiple LoRA-finetuned LLMs as one

Punica is a system designed to efficiently serve multiple LoRA-fine-tuned large language models within a shared GPU environment. LoRA is a parameter-efficient fine-tuning method that allows developers to adapt large pretrained models to specific tasks by adding lightweight adapter layers rather than retraining the entire model. Punica introduces a serving architecture that allows multiple LoRA adapters to share the same base model during inference, significantly reducing memory consumption...

Downloads: 0 This Week

Last Update: 2026-03-09

Search Results for "request system"

11 projects for "request system" with 2 filters applied:

SuggestArr

vLLM Semantic Router

Paddler

AxonHub

Envoy AI Gateway

Parallax

Integuru v0

tiny-llm

Shell-AI

Chat UI

Punica
