Showing 12 open source projects for "request system"

View related business solutions
  • $300 in Free Credit Towards Top Cloud Services Icon
    $300 in Free Credit Towards Top Cloud Services

    Build VMs, containers, AI, databases, storage—all in one place.

    Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.
    Get Started
  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    Let your crypto work for you

    Put idle assets to work with competitive interest rates, borrow without selling, and trade with precision. All in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • 1
    SuggestArr

    SuggestArr

    Request recommended movies, TV shows and anime to Jellyseer/Overseer

    SuggestArr is an open-source automation platform designed to recommend and automatically request movies, TV shows, and anime based on a user’s viewing history in self-hosted media servers. The project integrates with popular media management systems such as Jellyfin, Plex, and Emby, allowing it to analyze recently watched content and identify similar titles using metadata from the TMDb database. Once potential recommendations are identified, SuggestArr can automatically send download or request instructions to services like Jellyseer or Overseerr, which then coordinate with media download tools and libraries. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    vLLM Semantic Router

    vLLM Semantic Router

    System Level Intelligent Router for Mixture-of-Models at Cloud

    Semantic Router is an open-source system designed to intelligently route requests across multiple large language models based on the semantic meaning and complexity of user queries. Instead of sending every prompt to the same model, the system analyzes the intent and reasoning requirements of the request and dynamically selects the most appropriate model to process it.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Paddler

    Paddler

    Open-source LLM load balancer and serving platform for hosting LLMs

    ...The architecture is designed with privacy and cost control in mind, making it suitable for organizations that handle sensitive data or require predictable operational costs. Paddler also includes tools for monitoring, request buffering, and autoscaling integration so that deployments can adapt dynamically to changing workloads. A built-in administrative interface allows developers and operations teams to manage models, observe system performance, and test inference endpoints.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Envoy AI Gateway

    Envoy AI Gateway

    Manages Unified Access to Generative AI Services

    ...The gateway provides policy enforcement, observability, and routing capabilities that are specifically designed for AI inference workloads, including intelligent endpoint selection and request optimization.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 5
    Parallax

    Parallax

    Parallax is a distributed model serving framework

    Parallax is a decentralized inference framework designed to run large language models across distributed computing resources. Instead of relying on centralized GPU clusters in data centers, the system allows multiple heterogeneous machines to collaborate in serving AI inference workloads. Parallax divides model layers across different nodes and dynamically coordinates them to form a complete inference pipeline. A two-stage scheduling architecture determines how model layers are allocated to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    AxonHub

    AxonHub

    Use any SDK to call 100+ LLMs

    ...The system also provides infrastructure features such as request routing, failover mechanisms, load balancing, and cost management for AI applications. This architecture makes it easier to experiment with multiple models and manage production deployments that rely on several providers simultaneously.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    tiny-llm

    tiny-llm

    A course of learning LLM inference serving on Apple Silicon

    tiny-llm is an educational open-source project designed to teach system engineers how large language model inference and serving systems work by building them from scratch. The project is structured as a guided course that walks developers through the process of implementing the core components required to run a modern language model, including attention mechanisms, token generation, and optimization techniques.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 8
    Integuru v0

    Integuru v0

    The first AI agent that builds permissionless integrations

    Integuru is an open-source AI agent designed to automatically create integrations between software platforms by reverse-engineering their internal APIs. Instead of relying on official developer documentation or publicly available APIs, the system analyzes network traffic generated by user interactions within a web application. Developers capture browser requests and authentication data, which the agent then uses to infer the structure of the platform’s internal API endpoints. Based on this...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Banana Slides

    Banana Slides

    A native AI PPT generation application based on nano banana pro

    ...Built on top of the Nano Banana Pro framework, the software enables users to transform simple prompts or outlines into complete slide decks without manually formatting content. Instead of relying on traditional slide editing workflows, the system allows users to describe the desired presentation in natural language and have the AI generate structured slides, including titles, bullet points, and layout suggestions. The tool also supports iterative refinement through conversational commands, allowing users to modify individual slides or request stylistic changes without directly editing the presentation file. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • Go from Code to Production URL in Seconds Icon
    Go from Code to Production URL in Seconds

    Cloud Run deploys apps in any language instantly. Scales to zero. Pay only when code runs.

    Skip the Kubernetes configs. Cloud Run handles HTTPS, scaling, and infrastructure automatically. Two million requests free per month.
    Try it free
  • 10
    Shell-AI

    Shell-AI

    LangChain powered shell command generator and runner CLI

    Shell-AI is an open-source command-line interface utility that allows users to generate and execute shell commands using natural language prompts. Instead of requiring users to remember complex command syntax, the tool lets them describe their intent in plain English and automatically suggests commands that accomplish the task. The system is powered by large language models and integrates with frameworks such as LangChain to interpret user requests and translate them into executable shell...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Chat UI

    Chat UI

    The open source codebase powering HuggingChat

    Hugging Face Chat UI is an open-source web interface designed for interacting with large language models through a modern conversational interface. The project serves as the codebase behind HuggingChat and can be deployed locally or on cloud infrastructure to create customizable AI chat applications. Built with modern web technologies such as SvelteKit and backed by MongoDB for persistence, the interface provides a responsive environment for multi-turn conversations, file handling, and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Punica

    Punica

    Serving multiple LoRA finetuned LLM as one

    Punica is a system designed to efficiently serve multiple LoRA-fine-tuned large language models within a shared GPU environment. LoRA is a parameter-efficient fine-tuning method that allows developers to adapt large pretrained models to specific tasks by adding lightweight adapter layers rather than retraining the entire model. Punica introduces a serving architecture that allows multiple LoRA adapters to share the same base model during inference, significantly reducing memory consumption...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB