11 projects for "request system" with 2 filters applied:

  • 1
    SuggestArr

    Request recommended movies, TV shows and anime to Jellyseerr/Overseerr

    SuggestArr is an open-source automation platform designed to recommend and automatically request movies, TV shows, and anime based on a user’s viewing history in self-hosted media servers. The project integrates with popular media management systems such as Jellyfin, Plex, and Emby, allowing it to analyze recently watched content and identify similar titles using metadata from the TMDb database. Once potential recommendations are identified, SuggestArr can automatically send download or request instructions to services like Jellyseerr or Overseerr, which then coordinate with media download tools and libraries. ...
    Downloads: 0 This Week
  • 2
    vLLM Semantic Router

    System Level Intelligent Router for Mixture-of-Models at Cloud

    Semantic Router is an open-source system designed to intelligently route requests across multiple large language models based on the semantic meaning and complexity of user queries. Instead of sending every prompt to the same model, the system analyzes the intent and reasoning requirements of the request and dynamically selects the most appropriate model to process it.
    Downloads: 0 This Week
  • 3
    Paddler

    Open-source LLM load balancer and serving platform for hosting LLMs

    ...The architecture is designed with privacy and cost control in mind, making it suitable for organizations that handle sensitive data or require predictable operational costs. Paddler also includes tools for monitoring, request buffering, and autoscaling integration so that deployments can adapt dynamically to changing workloads. A built-in administrative interface allows developers and operations teams to manage models, observe system performance, and test inference endpoints.
    Downloads: 0 This Week
  • 4
    AxonHub

    Use any SDK to call 100+ LLMs

    ...The system also provides infrastructure features such as request routing, failover mechanisms, load balancing, and cost management for AI applications. This architecture makes it easier to experiment with multiple models and manage production deployments that rely on several providers simultaneously.
    Downloads: 5 This Week
  • 5
    Envoy AI Gateway

    Manages Unified Access to Generative AI Services

    ...The gateway provides policy enforcement, observability, and routing capabilities that are specifically designed for AI inference workloads, including intelligent endpoint selection and request optimization.
    Downloads: 0 This Week
  • 6
    Parallax

    Parallax is a distributed model serving framework

    Parallax is a decentralized inference framework designed to run large language models across distributed computing resources. Instead of relying on centralized GPU clusters in data centers, the system allows multiple heterogeneous machines to collaborate in serving AI inference workloads. Parallax divides model layers across different nodes and dynamically coordinates them to form a complete inference pipeline. A two-stage scheduling architecture determines how model layers are allocated to...
    Downloads: 0 This Week
  • 7
    Integuru v0

    The first AI agent that builds permissionless integrations

    Integuru is an open-source AI agent designed to automatically create integrations between software platforms by reverse-engineering their internal APIs. Instead of relying on official developer documentation or publicly available APIs, the system analyzes network traffic generated by user interactions within a web application. Developers capture browser requests and authentication data, which the agent then uses to infer the structure of the platform’s internal API endpoints. Based on this...
    Downloads: 0 This Week
  • 8
    tiny-llm

    A course on LLM inference serving on Apple Silicon

    tiny-llm is an educational open-source project designed to teach system engineers how large language model inference and serving systems work by building them from scratch. The project is structured as a guided course that walks developers through the process of implementing the core components required to run a modern language model, including attention mechanisms, token generation, and optimization techniques.
    Downloads: 3 This Week
  • 9
    Shell-AI

    LangChain powered shell command generator and runner CLI

    Shell-AI is an open-source command-line interface utility that allows users to generate and execute shell commands using natural language prompts. Instead of requiring users to remember complex command syntax, the tool lets them describe their intent in plain English and automatically suggests commands that accomplish the task. The system is powered by large language models and integrates with frameworks such as LangChain to interpret user requests and translate them into executable shell...
    Downloads: 0 This Week
  • 10
    Chat UI

    The open source codebase powering HuggingChat

    Hugging Face Chat UI is an open-source web interface designed for interacting with large language models through a modern conversational interface. The project serves as the codebase behind HuggingChat and can be deployed locally or on cloud infrastructure to create customizable AI chat applications. Built with modern web technologies such as SvelteKit and backed by MongoDB for persistence, the interface provides a responsive environment for multi-turn conversations, file handling, and...
    Downloads: 0 This Week
  • 11
    Punica

    Serving multiple LoRA fine-tuned LLMs as one

    Punica is a system designed to efficiently serve multiple LoRA-fine-tuned large language models within a shared GPU environment. LoRA is a parameter-efficient fine-tuning method that allows developers to adapt large pretrained models to specific tasks by adding lightweight adapter layers rather than retraining the entire model. Punica introduces a serving architecture that allows multiple LoRA adapters to share the same base model during inference, significantly reducing memory consumption...
    Downloads: 0 This Week