api server free download

Showing 6 open source projects for "api server"

View related business solutions

Artificial Intelligence Rust Clear Filters & Widen Search

MongoDB Atlas runs apps anywhere
Deploy in 115+ regions with the modern database for every enterprise.

MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.

Start Free
Earn up to 16% annual interest with Nexo.
Let your crypto work for you

Put idle assets to work with competitive interest rates, borrow without selling, and trade with precision. All in one platform. Geographic restrictions, eligibility, and terms apply.

Get started with Nexo.
1

Text Embeddings Inference

High-performance inference server for text embeddings models API layer

Text Embeddings Inference is a high-performance server designed to serve text embedding models efficiently in production environments. It focuses on delivering fast and scalable embedding generation by leveraging optimized inference techniques and modern hardware acceleration. It is built to support transformer-based embedding models, making it suitable for tasks such as semantic search, clustering, and retrieval-augmented systems.

Downloads: 7 This Week

Last Update: 2026-03-23
See Project
2

shimmy

Python-free Rust inference server

The shimmy project is a lightweight local inference server designed to run large language models with minimal overhead. Written primarily in Rust, the tool provides a small standalone binary that exposes an API compatible with the OpenAI interface, allowing existing applications to interact with local models without significant code changes. This compatibility enables developers to replace remote AI services with locally hosted models while keeping their existing software architecture intact. ...

Downloads: 3 This Week

Last Update: 2026-03-11
See Project
3

ort

Fast ML inference & training for ONNX models in Rust

ort is a high-performance Rust library that provides bindings to ONNX Runtime, enabling developers to run machine learning inference and training workflows directly within Rust applications using the standardized ONNX model format. It is designed to bridge the gap between modern machine learning frameworks and systems programming by offering a safe, ergonomic API for executing models originally built in ecosystems like PyTorch, TensorFlow, or scikit-learn. The library emphasizes speed and efficiency, leveraging hardware acceleration across CPUs, GPUs, and specialized accelerators to deliver low-latency inference both on-device and in server environments. One of its key strengths is its flexibility, as it supports multiple backends and allows developers to configure execution providers depending on available hardware. ort also includes advanced capabilities such as model compilation and optimization, reducing startup time and improving runtime performance in production systems.

Downloads: 6 This Week

Last Update: 2026-03-19
See Project
4

mistral.rs

Fast, flexible LLM inference

mistral.rs is a fast and flexible LLM inference engine implemented in Rust, designed to run and serve modern language models with an emphasis on performance and practical deployment. It provides multiple entry points for developers, including a CLI for running models locally and an HTTP server that exposes an OpenAI-compatible API surface for easy integration with existing clients. The project includes hardware-aware tooling that can benchmark a system and choose sensible quantization and device-mapping strategies, helping users get strong performance without manual tuning. It also supports serving multiple models from the same server process, enabling routing or quick switching between models depending on workload needs. ...

Downloads: 4 This Week

Last Update: 2026-04-02
See Project
Gemini 3 and 200+ AI Models on One Platform
Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

Build generative AI apps with Vertex AI. Switch between models without switching platforms.

Start Free
5

Cog

Package and deploy machine learning models using Docker containers

...Cog also resolves compatibility issues between frameworks and GPU libraries by automatically selecting compatible combinations of CUDA, cuDNN, and machine learning frameworks such as PyTorch or TensorFlow. Cog automatically generates a RESTful HTTP API for running predictions, enabling models to be accessed programmatically through a built-in prediction server.

Downloads: 17 This Week

Last Update: 2026-04-02
See Project
6

Google Workspace CLI

Command-line tool for Drive, Gmail, Calendar, Sheets, Docs, Chat, etc.

Google Workspace CLI (gws) is a command-line tool designed to interact with Google Workspace services such as Drive, Gmail, Calendar, Sheets, and more from a single interface. It dynamically generates its command structure using Google’s Discovery Service, allowing it to automatically support new API endpoints as they become available. The tool eliminates the need for manual REST API calls by providing structured commands and built-in help for each resource and method. It outputs structured...

1 Review

Downloads: 27 This Week

Last Update: 2026-03-31
See Project