Showing 20 open source projects for "openssh-server"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 1
    OpenVINO Model Server

    OpenVINO Model Server

    A scalable inference server for models optimized with OpenVINO

    OpenVINO™ Model Server is a high-performance inference serving system designed to host and serve machine learning models that have been optimized with the OpenVINO toolkit. It’s implemented in C++ for scalability and efficiency, making it suitable for both edge and cloud deployments where inference workloads must be reliable and high throughput. The server exposes model inference via standard network protocols like REST and gRPC, allowing any client that speaks those protocols to request predictions remotely, abstracting away the complexity of where and how the model runs. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 2
    Triton Inference Server

    Triton Inference Server

    The Triton Inference Server provides an optimized cloud

    Triton Inference Server is an open-source inference serving software that streamlines AI inferencing. Triton enables teams to deploy any AI model from multiple deep learning and machine learning frameworks, including TensorRT, TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. Triton supports inference across cloud, data center, edge, and embedded devices on NVIDIA GPUs, x86 and ARM CPU, or AWS Inferentia.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 3
    LoRAX

    LoRAX

    Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

    Lorax is a multi-LoRA (Low-Rank Adaptation) inference server that scales to thousands of fine-tuned Large Language Models (LLMs). It enables efficient deployment and management of numerous fine-tuned models, facilitating scalable AI applications. Lorax is designed to handle high concurrency and provides a robust infrastructure for serving multiple LLMs simultaneously.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    LazyLLM

    LazyLLM

    Easiest and laziest way for building multi-agent LLMs applications

    LazyLLM is an optimized, lightweight LLM server designed for easy and fast deployment of large language models. It is fully compatible with the OpenAI API specification, enabling developers to integrate their own models into applications that normally rely on OpenAI’s endpoints. LazyLLM emphasizes low resource usage and fast inference while supporting multiple models.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Application Monitoring That Won't Slow Your App Down Icon
    Application Monitoring That Won't Slow Your App Down

    AppSignal's Rust-based agent is lightweight and stable. Already running in thousands of production apps.

    Full APM with errors, performance, logs, and uptime monitoring. 99.999% uptime SLA on the platform itself.
    Start Free
  • 5
    BrowserAI

    BrowserAI

    Run local LLMs like llama, deepseek, kokoro etc. inside your browser

    BrowserAI is a cutting-edge platform that allows users to run large language models (LLMs) directly in their web browser without the need for a server. It leverages WebGPU for accelerated performance and supports offline functionality, making it a highly efficient and privacy-conscious solution. The platform provides a developer-friendly SDK with pre-configured popular models, and it allows for seamless switching between MLC and Transformer engines. Additionally, it supports features such as speech recognition, text-to-speech, structured output generation, and Web Worker support for non-blocking UI performance.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    PaddleSpeech

    PaddleSpeech

    Easy-to-use Speech Toolkit including Self-Supervised Learning model

    ...Via the easy-to-use, efficient, flexible and scalable implementation, our vision is to empower both industrial application and academic research, including training, inference & testing modules, and deployment process. Low barriers to install, CLI, Server, and Streaming Server is available to quick-start your journey. We provide high-speed and ultra-lightweight models, and also cutting-edge technology. We provide production ready streaming asr and streaming tts system. Our frontend contains Text Normalization and Grapheme-to-Phoneme (G2P, including Polyphone and Tone Sandhi). Moreover, we use self-defined linguistic rules to adapt Chinese context.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    DeepDetect

    DeepDetect

    Deep Learning API and Server in C++14 support for Caffe, PyTorch

    ...Training in a few hours and with small data thanks to 25+ pre-trained models. Full Open Source, with an ecosystem of tools (API clients, video, annotation, ...) Fast Server written in pure C++, a single codebase for Cloud, Desktop & Embedded.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Text Generation Inference

    Text Generation Inference

    Large Language Model Text Generation Inference

    Text Generation Inference is a high-performance inference server for text generation models, optimized for Hugging Face's Transformers. It is designed to serve large language models efficiently with optimizations for performance and scalability.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    KServe

    KServe

    Standardized Serverless ML Inference Platform on Kubernetes

    ...It aims to solve production model serving use cases by providing performant, high abstraction interfaces for common ML frameworks like Tensorflow, XGBoost, ScikitLearn, PyTorch, and ONNX. It encapsulates the complexity of autoscaling, networking, health checking, and server configuration to bring cutting edge serving features like GPU Autoscaling, Scale to Zero, and Canary Rollouts to your ML deployments. It enables a simple, pluggable, and complete story for Production ML Serving including prediction, pre-processing, post-processing and explainability. KServe is being used across various organizations.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Build Securely on Azure with Proven Frameworks Icon
    Build Securely on Azure with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 10
    API-for-Open-LLM

    API-for-Open-LLM

    Openai style api for open large language models

    API-for-Open-LLM is a lightweight API server designed for deploying and serving open large language models (LLMs), offering a simple way to integrate LLMs into applications.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    dstack

    dstack

    Open-source tool designed to enhance the efficiency of workloads

    dstack is an open-source tool designed to enhance the efficiency of running ML workloads in any cloud (AWS, GCP, Azure, Lambda, etc). It streamlines development and deployment, reduces cloud costs, and frees users from vendor lock-in.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Infinity

    Infinity

    Low-latency REST API for serving text-embeddings

    Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting all sentence-transformer models and frameworks. Infinity is developed under MIT License. Infinity powers inference behind Gradient.ai and other Embedding API providers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    SageMaker Hugging Face Inference Toolkit

    SageMaker Hugging Face Inference Toolkit

    Library for serving Transformers models on Amazon SageMaker

    ...This library provides default pre-processing, predict and postprocessing for certain Transformers models and tasks. It utilizes the SageMaker Inference Toolkit for starting up the model server, which is responsible for handling inference requests. For the Dockerfiles used for building SageMaker Hugging Face Containers, see AWS Deep Learning Containers. The SageMaker Hugging Face Inference Toolkit implements various additional environment variables to simplify your deployment experience. The Hugging Face Inference Toolkit allows user to override the default methods of the HuggingFaceHandlerService. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    kolosal

    kolosal

    Open Source and Lightweight Local LLM Platform

    Kolosal AI is the leading open-source local LLM platform. Download, train, and run local LLM models on your device with no cloud dependencies. An opensource and lightweight alternative to LM Studio.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 15
    Openfire LLM Chatbot Plugin

    Openfire LLM Chatbot Plugin

    LLM Chatbot Assistant for Openfire server

    This plugin is a wrapper to hosted AI Inference server for LLM chat models. It uses the Botz API to create a chatbot in Openfire which will engage in XMPP chat and groupchat conversations.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    LLaVA

    LLaVA

    Visual Instruction Tuning: Large Language-and-Vision Assistant

    Visual instruction tuning towards large language and vision models with GPT-4 level capabilities.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    SageMaker Inference Toolkit

    SageMaker Inference Toolkit

    Serve machine learning models within a Docker container

    ...The SageMaker Inference Toolkit implements a model serving stack and can be easily added to any Docker container, making it deployable to SageMaker. This library's serving stack is built on Multi Model Server, and it can serve your own models or those you trained on SageMaker using machine learning frameworks with native SageMaker support.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    SageMaker MXNet Inference Toolkit

    SageMaker MXNet Inference Toolkit

    Toolkit for allowing inference and serving with MXNet in SageMaker

    ...This library provides default pre-processing, predict and postprocessing for certain MXNet model types and utilizes the SageMaker Inference Toolkit for starting up the model server, which is responsible for handling inference requests. AWS Deep Learning Containers (DLCs) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch, and MXNet. Deep Learning Containers provide optimized environments with TensorFlow and MXNet, Nvidia CUDA (for GPU instances), and Intel MKL (for CPU instances) libraries and are available in the Amazon Elastic Container Registry (Amazon ECR). ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Hugging Face Transformer

    Hugging Face Transformer

    CPU/GPU inference server for Hugging Face transformer models

    ...Most tutorials on Transformer deployment in production are built over Pytorch and FastAPI. Both are great tools but not very performant in inference. Then, if you spend some time, you can build something over ONNX Runtime and Triton inference server. You will usually get from 2X to 4X faster inference compared to vanilla Pytorch. It's cool! However, if you want the best in class performances on GPU, there is only a single possible combination: Nvidia TensorRT and Triton. You will usually get 5X faster inference compared to vanilla Pytorch.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    BudgetML

    BudgetML

    Deploy a ML inference service on a budget in 10 lines of code

    ...It is supposed to be fast, easy, and developer-friendly. It is by no means meant to be used in a full-fledged production-ready setup. It is simply a means to get a server up and running as fast as possible with the lowest costs possible.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB