Showing 12 open source projects for "linux pxe server"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Forever Free Full-Stack Observability | Grafana Cloud Icon
    Forever Free Full-Stack Observability | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 1
    LoRAX

    LoRAX

    Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

    Lorax is a multi-LoRA (Low-Rank Adaptation) inference server that scales to thousands of fine-tuned Large Language Models (LLMs). It enables efficient deployment and management of numerous fine-tuned models, facilitating scalable AI applications. Lorax is designed to handle high concurrency and provides a robust infrastructure for serving multiple LLMs simultaneously.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    LazyLLM

    LazyLLM

    Easiest and laziest way for building multi-agent LLMs applications

    LazyLLM is an optimized, lightweight LLM server designed for easy and fast deployment of large language models. It is fully compatible with the OpenAI API specification, enabling developers to integrate their own models into applications that normally rely on OpenAI’s endpoints. LazyLLM emphasizes low resource usage and fast inference while supporting multiple models.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    KServe

    KServe

    Standardized Serverless ML Inference Platform on Kubernetes

    KServe provides a Kubernetes Custom Resource Definition for serving machine learning (ML) models on arbitrary frameworks. It aims to solve production model serving use cases by providing performant, high abstraction interfaces for common ML frameworks like Tensorflow, XGBoost, ScikitLearn, PyTorch, and ONNX. It encapsulates the complexity of autoscaling, networking, health checking, and server configuration to bring cutting edge serving features like GPU Autoscaling, Scale to Zero, and...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    API-for-Open-LLM

    API-for-Open-LLM

    Openai style api for open large language models

    API-for-Open-LLM is a lightweight API server designed for deploying and serving open large language models (LLMs), offering a simple way to integrate LLMs into applications.
    Downloads: 0 This Week
    Last Update:
    See Project
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 5
    Text Generation Inference

    Text Generation Inference

    Large Language Model Text Generation Inference

    Text Generation Inference is a high-performance inference server for text generation models, optimized for Hugging Face's Transformers. It is designed to serve large language models efficiently with optimizations for performance and scalability.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Infinity

    Infinity

    Low-latency REST API for serving text-embeddings

    Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting all sentence-transformer models and frameworks. Infinity is developed under MIT License. Infinity powers inference behind Gradient.ai and other Embedding API providers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    SageMaker Hugging Face Inference Toolkit

    SageMaker Hugging Face Inference Toolkit

    Library for serving Transformers models on Amazon SageMaker

    SageMaker Hugging Face Inference Toolkit is an open-source library for serving Transformers models on Amazon SageMaker. This library provides default pre-processing, predict and postprocessing for certain Transformers models and tasks. It utilizes the SageMaker Inference Toolkit for starting up the model server, which is responsible for handling inference requests. For the Dockerfiles used for building SageMaker Hugging Face Containers, see AWS Deep Learning Containers. The SageMaker Hugging...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    LLaVA

    LLaVA

    Visual Instruction Tuning: Large Language-and-Vision Assistant

    Visual instruction tuning towards large language and vision models with GPT-4 level capabilities.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 9
    SageMaker Inference Toolkit

    SageMaker Inference Toolkit

    Serve machine learning models within a Docker container

    Serve machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. Once you have a trained model, you can include it in a Docker container that runs your inference code. A container provides an effectively isolated environment, ensuring a consistent runtime regardless of where the...
    Downloads: 0 This Week
    Last Update:
    See Project
  • MyQ Print Management Software Icon
    MyQ Print Management Software

    SAVE TIME WITH PERSONALIZED PRINT SOLUTIONS

    Boost your digital or traditional workplace with MyQ’s secure print and scan solutions that respect your time and help you focus on what you do best.
    Learn More
  • 10
    SageMaker MXNet Inference Toolkit

    SageMaker MXNet Inference Toolkit

    Toolkit for allowing inference and serving with MXNet in SageMaker

    SageMaker MXNet Inference Toolkit is an open-source library for serving MXNet models on Amazon SageMaker. This library provides default pre-processing, predict and postprocessing for certain MXNet model types and utilizes the SageMaker Inference Toolkit for starting up the model server, which is responsible for handling inference requests. AWS Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and MXNet. Deep...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Hugging Face Transformer

    Hugging Face Transformer

    CPU/GPU inference server for Hugging Face transformer models

    Optimize and deploy in production Hugging Face Transformer models in a single command line. At Lefebvre Dalloz we run in-production semantic search engines in the legal domain, in the non-marketing language it's a re-ranker, and we based ours on Transformer. In that setup, latency is key to providing a good user experience, and relevancy inference is done online for hundreds of snippets per user query. Most tutorials on Transformer deployment in production are built over Pytorch and FastAPI....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    BudgetML

    BudgetML

    Deploy a ML inference service on a budget in 10 lines of code

    Deploy a ML inference service on a budget in less than 10 lines of code. BudgetML is perfect for practitioners who would like to quickly deploy their models to an endpoint, but not waste a lot of time, money, and effort trying to figure out how to do this end-to-end. We built BudgetML because it's hard to find a simple way to get a model in production fast and cheaply. Deploying from scratch involves learning too many different concepts like SSL certificate generation, Docker, REST,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB