multi-threaded free download

AxonHub

Use any SDK to call 100+ LLMs

AxonHub is an open-source AI gateway platform designed to simplify the process of integrating and switching between different large language model providers. The system acts as a compatibility layer that allows developers to use the same SDK interface while routing requests to various AI services behind the scenes. Instead of rewriting code when switching providers such as OpenAI or Anthropic, developers can simply change configuration settings within the gateway. AxonHub translates requests...

Downloads: 31 This Week

Last Update: 3 days ago


KubeAI

Private Open AI on Kubernetes

Get inference running on Kubernetes: LLMs, embeddings, speech-to-text. KubeAI serves an OpenAI-compatible HTTP API. Admins configure ML models through the Model Kubernetes Custom Resource. KubeAI can be thought of as a Model Operator (see Operator Pattern) that manages vLLM and Ollama servers.

Downloads: 7 This Week

Last Update: 2026-03-31


LLaMA.go

llama.go is like llama.cpp in pure Golang

llama.go is like llama.cpp in pure Golang. The code of the project is based on the legendary ggml.cpp framework of Georgi Gerganov written in C++ with the same attitude to performance and elegance. Both models store FP32 weights, so you'll need at least 32 GB of RAM (not VRAM or GPU RAM) for LLaMA-7B. Double that to 64 GB for LLaMA-13B.

Downloads: 0 This Week

Last Update: 2023-08-25
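
The RAM figures quoted for llama.go follow from simple arithmetic: each FP32 weight occupies 4 bytes. The sketch below is a rough estimate only, assuming the nominal parameter counts implied by the model names (7 and 13 billion) and counting weights alone, without activations or runtime overhead. The helper name `fp32_weight_gib` is hypothetical and is not part of llama.go.

```python
BYTES_PER_FP32 = 4  # a 32-bit float occupies 4 bytes


def fp32_weight_gib(num_params: float) -> float:
    """Approximate memory needed for FP32 weights alone, in GiB."""
    return num_params * BYTES_PER_FP32 / 2**30


# Nominal parameter counts taken from the model names (an assumption;
# the exact counts of the released checkpoints differ slightly).
MODELS = {"LLaMA-7B": 7e9, "LLaMA-13B": 13e9}

for name, params in MODELS.items():
    print(f"{name}: about {fp32_weight_gib(params):.1f} GiB of FP32 weights")
```

Under these assumptions the weights alone come to somewhat under the 32 GB and 64 GB recommendations, which leaves headroom for the operating system and inference buffers.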


Search Results for "multi-threaded"

Showing 3 open source projects for "multi-threaded"

AxonHub

KubeAI

LLaMA.go

