apache server monitor free download

OpenLLM

Operating LLMs in production

An open platform for operating large language models (LLMs) in production. Fine-tune, serve, deploy, and monitor any LLM with ease. With OpenLLM, you can run inference with any open-source large language model, deploy to the cloud or on-premises, and build powerful AI apps. It has built-in support for a wide range of open-source LLMs and model runtimes, including Llama 2, StableLM, Falcon, Dolly, Flan-T5, ChatGLM, StarCoder, and more. Serve LLMs over RESTful API or gRPC with one command, query via...

Downloads: 11 This Week

Last Update: 2025-04-21


LoRAX

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

LoRAX is a multi-LoRA (Low-Rank Adaptation) inference server that scales to thousands of fine-tuned large language models (LLMs). It enables efficient deployment and management of numerous fine-tuned models, facilitating scalable AI applications. LoRAX is designed to handle high concurrency and provides a robust infrastructure for serving multiple LLMs simultaneously.

Downloads: 5 This Week

Last Update: 2025-03-19


LazyLLM

Easiest and laziest way to build multi-agent LLM applications

LazyLLM is an optimized, lightweight LLM server designed for easy and fast deployment of large language models. It is fully compatible with the OpenAI API specification, enabling developers to integrate their own models into applications that normally rely on OpenAI’s endpoints. LazyLLM emphasizes low resource usage and fast inference while supporting multiple models.

Downloads: 11 This Week

Last Update: 2026-03-04


Text Generation Inference

Large Language Model Text Generation Inference

Text Generation Inference is a high-performance inference server for text generation models, optimized for Hugging Face's Transformers. It is designed to serve large language models efficiently with optimizations for performance and scalability.

Downloads: 11 This Week

Last Update: 2025-12-18


Seldon Core

An MLOps framework to package, deploy, monitor and manage models

The de facto standard open-source platform for rapidly deploying machine learning models on Kubernetes. Seldon Core, our open-source framework, makes it easier and faster to deploy your machine learning models and experiments at scale on Kubernetes. Seldon Core serves models built in any open-source or commercial model building framework. You can make use of powerful Kubernetes features like custom resource definitions to manage model graphs. And then connect your continuous integration and...

Downloads: 2 This Week

Last Update: 2026-01-23


SageMaker Hugging Face Inference Toolkit

Library for serving Transformers models on Amazon SageMaker

...SageMaker Hugging Face Inference Toolkit is licensed under the Apache 2.0 License.

Downloads: 4 This Week

Last Update: 2026-03-17


KServe

Standardized Serverless ML Inference Platform on Kubernetes

KServe provides a Kubernetes Custom Resource Definition for serving machine learning (ML) models on arbitrary frameworks. It aims to solve production model serving use cases by providing performant, high-abstraction interfaces for common ML frameworks like TensorFlow, XGBoost, scikit-learn, PyTorch, and ONNX. It encapsulates the complexity of autoscaling, networking, health checking, and server configuration to bring cutting-edge serving features like GPU Autoscaling, Scale to Zero, and...

Downloads: 10 This Week

Last Update: 2026-03-13


API-for-Open-LLM

OpenAI-style API for open large language models

API-for-Open-LLM is a lightweight API server designed for deploying and serving open large language models (LLMs), offering a simple way to integrate LLMs into applications.

Downloads: 0 This Week

Last Update: 2025-01-22


PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model

PaddleSpeech is an open-source toolkit on the PaddlePaddle platform for a variety of critical tasks in speech and audio, with state-of-the-art and influential models. Through an easy-to-use, efficient, flexible, and scalable implementation, our vision is to empower both industrial applications and academic research, including training, inference and testing modules, and the deployment process. Installation barriers are low, and a CLI, Server, and Streaming Server are available to quick-start your journey. We provide...

Downloads: 0 This Week

Last Update: 2025-03-04


LLaVA

Visual Instruction Tuning: Large Language-and-Vision Assistant

Visual instruction tuning towards large language and vision models with GPT-4 level capabilities.

Downloads: 5 This Week

Last Update: 2024-02-04


SageMaker Inference Toolkit

Serve machine learning models within a Docker container

Serve machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. Once you have a trained model, you can include it in a Docker container that runs your inference code. A container provides an effectively isolated environment, ensuring a consistent runtime regardless of where the...

Downloads: 0 This Week

Last Update: 2023-10-25


SageMaker MXNet Inference Toolkit

Toolkit for allowing inference and serving with MXNet in SageMaker

SageMaker MXNet Inference Toolkit is an open-source library for serving MXNet models on Amazon SageMaker. This library provides default pre-processing, predict and postprocessing for certain MXNet model types and utilizes the SageMaker Inference Toolkit for starting up the model server, which is responsible for handling inference requests. AWS Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and MXNet. Deep...

Downloads: 0 This Week

Last Update: 2022-07-05


Hugging Face Transformer

CPU/GPU inference server for Hugging Face transformer models

Optimize and deploy Hugging Face Transformer models in production with a single command line. At Lefebvre Dalloz we run in-production semantic search engines in the legal domain; in non-marketing language, it's a re-ranker, and we based ours on Transformer. In that setup, latency is key to providing a good user experience, and relevancy inference is done online for hundreds of snippets per user query. Most tutorials on Transformer deployment in production are built over PyTorch and FastAPI....

Downloads: 1 This Week

Last Update: 2022-08-22


BudgetML

Deploy an ML inference service on a budget in 10 lines of code

Deploy an ML inference service on a budget in less than 10 lines of code. BudgetML is perfect for practitioners who would like to quickly deploy their models to an endpoint without wasting a lot of time, money, and effort trying to figure out how to do this end-to-end. We built BudgetML because it's hard to find a simple way to get a model into production quickly and cheaply. Deploying from scratch involves learning too many different concepts like SSL certificate generation, Docker, REST,...

Downloads: 0 This Week

Last Update: 2022-08-26


Search Results for "apache server monitor"

Showing 14 open source projects for "apache server monitor"


