Go LLM Inference Tools

Browse free open source Go LLM Inference Tools and projects below.

  • 1
    LocalAI

    The free, Open Source alternative to OpenAI, Claude and others

    LocalAI is an open-source platform that allows users to run large language models and other AI systems locally on their own hardware. It acts as a drop-in replacement for APIs such as OpenAI, enabling developers to build AI-powered applications without relying on external cloud services. The platform supports a wide range of model types, including text generation, image creation, speech processing, and embeddings. LocalAI can run on consumer-grade hardware and does not necessarily require a GPU, making it accessible for local development and private deployments. It integrates with multiple backends like llama.cpp, transformers, and diffusers to support different AI workloads. With its self-hosted architecture and OpenAI-compatible API, LocalAI enables developers to build secure, local-first AI applications.
    Downloads: 36 This Week
  • 2
    Gitleaks

    Protect and discover secrets using Gitleaks

    Gitleaks is a fast, lightweight, portable, and open-source secret scanner for git repositories, files, and directories. With over 6.8 million Docker downloads, 11.2k GitHub stars, 1.7 million GitHub downloads, thousands of weekly clones, and over 400k Homebrew installs, Gitleaks is the most trusted secret scanner among security professionals, enterprises, and developers. Gitleaks-Action is our official GitHub Action. You can use it to automatically run a Gitleaks scan on all your team's pull requests and commits, or to run on-demand scans. If you are scanning repos that belong to a GitHub organization account, you'll have to obtain a license. Gitleaks can be installed using Homebrew, Docker, or Go. It is also available in binary form for many popular platforms and OS types on the releases page. In addition, Gitleaks can be set up as a pre-commit hook directly in your repo or as a GitHub Action using Gitleaks-Action.
    Downloads: 29 This Week
  • 3
    KubeAI

    Private Open AI on Kubernetes

    Get inferencing running on Kubernetes: LLMs, Embeddings, Speech-to-Text. KubeAI serves an OpenAI-compatible HTTP API. Admins can configure ML models by using the Model Kubernetes Custom Resources. KubeAI can be thought of as a Model Operator (see the Kubernetes Operator pattern) that manages vLLM and Ollama servers.
    Downloads: 4 This Week
  • 4
    hfapigo

    Unofficial Go (Golang) bindings for the Hugging Face Inference API

    Go (Golang) bindings for the Hugging Face Inference API. Directly call any model available in the Model Hub. An API key is required for authorized access; to get one, create a Hugging Face profile.
    Downloads: 4 This Week
  • 5
    Featureform

    Turn your existing data infrastructure into a feature store

    Featureform allows data scientists to define, manage, and serve machine learning features across your organization. The days of untitled_128.ipynb are over. Transformations, features, and training sets can be pushed from notebooks to a centralized feature repository with metadata like name, variant, lineage, and owner. Featureform's Virtual Feature Store architecture orchestrates your data infrastructure to build and maintain your training sets and production features. It offers a framework with built-in feature versioning, lineage, orchestration, monitoring, and governance. Define your features once with Featureform, and we'll orchestrate your transformation pipelines for both training and inference, across batch and streaming. All transformations and features are searchable, reusable, and extensible. The days of sending notebooks and datasets over Slack are over.
    Downloads: 1 This Week
  • 6
    spaGO

    Self-contained Machine Learning and Natural Language Processing lib

    A Machine Learning library written in pure Go, designed to support relevant neural architectures in Natural Language Processing. Spago is self-contained: it uses its own lightweight computational graph for both training and inference, which makes it easy to understand from start to finish. The core module of Spago relies only on testify for unit testing. In other words, it has "zero dependencies", and we are committed to keeping it that way as much as possible. Spago uses a multi-module workspace to ensure that additional dependencies are downloaded only when specific features (e.g. persistent embeddings) are used. A good place to start is by looking at the implementation of built-in neural models, such as the LSTM. Except for a few linear algebra operations written in assembly for optimal performance (with a bit of copying from Gonum), it's straightforward Go code, so you don't have to worry.
    Downloads: 1 This Week
  • 7
    Beta9

    Run serverless GPU workloads with fast cold starts on bare-metal

    beta9 is a platform that enables running serverless GPU workloads with fast cold starts on bare-metal servers globally. It allows developers to deploy and scale GPU-accelerated applications without managing underlying infrastructure, offering flexibility and efficiency for AI and high-performance computing tasks. beta9 supports various frameworks and provides tools for monitoring and managing deployments effectively.
    Downloads: 0 This Week
  • 8
    LLaMA.go

    llama.go is like llama.cpp in pure Golang

    llama.go is like llama.cpp in pure Golang. The project's code is based on Georgi Gerganov's legendary ggml.cpp framework, written in C++, and keeps the same attitude to performance and elegance. Both models store FP32 weights, so you'll need at least 32 GB of RAM (not VRAM or GPU RAM) for LLaMA-7B, and double that, 64 GB, for LLaMA-13B.
    Downloads: 0 This Week
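
The RAM figures in the LLaMA.go description are consistent with simple arithmetic: an FP32 weight takes 4 bytes, so the weights alone need roughly 4 bytes per parameter. The sketch below works that out, assuming nominal round parameter counts of 7 and 13 billion; the helper names are made up for illustration and are not part of llama.go. It counts weights only, so the working memory a real run needs on top of that is not included.

```go
package main

import "fmt"

// bytesPerFP32 is the size of one 32-bit floating-point weight.
const bytesPerFP32 = 4

// fp32WeightBytes returns the bytes needed to hold params FP32 weights.
// It counts weights only; activations and runtime overhead are extra.
func fp32WeightBytes(params uint64) uint64 {
	return params * bytesPerFP32
}

// toGiB converts a byte count to gibibytes (2^30 bytes).
func toGiB(b uint64) float64 {
	return float64(b) / float64(1<<30)
}

// model pairs a label with an assumed, nominal parameter count.
type model struct {
	name   string
	params uint64
}

func main() {
	models := []model{
		{name: "LLaMA-7B", params: 7_000_000_000},
		{name: "LLaMA-13B", params: 13_000_000_000},
	}
	for _, m := range models {
		b := fp32WeightBytes(m.params)
		fmt.Printf("%s: %d bytes of FP32 weights (%.1f GiB)\n", m.name, b, toGiB(b))
	}
}
```

Under these assumptions the weights come to about 26 GiB and 48 GiB respectively, which fits within the stated 32 GB and 64 GB once some headroom for the rest of the process is allowed.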