Showing 25 open source projects for "hardware"

View related business solutions
  • Ship Agents Faster Icon
    Ship Agents Faster

    Transform your applications and workflows into powerful agentic systems at global scale.

    Gemini Enterprise Agent Platform lets you rapidly build, scale, govern and optimize production-ready agents grounded in your organization's data. The platform enables developers to build custom or pre-built agents for virtually any use case. New customers get $300 in free credits.
    Get Started Free
  • Host LLMs in Production With On-Demand GPUs Icon
    Host LLMs in Production With On-Demand GPUs

    NVIDIA L4 GPUs. 5-second cold starts. Scale to zero when idle.

    Deploy your model, get an endpoint, pay only for compute time. No GPU provisioning or infrastructure management required.
    Try Free
  • 1
    GPT4All

    GPT4All

    Run Local LLMs on Any Device. Open-source

    ...The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This project also supports Python integrations for easy automation and customization. GPT4All is ideal for individuals and businesses seeking private, offline access to powerful LLMs.
    Downloads: 105 This Week
    Last Update:
    See Project
  • 2
    LocalAI

    LocalAI

    The free, Open Source alternative to OpenAI, Claude and others

    LocalAI is an open-source platform that allows users to run large language models and other AI systems locally on their own hardware. It acts as a drop-in replacement for APIs such as OpenAI, enabling developers to build AI-powered applications without relying on external cloud services. The platform supports a wide range of model types, including text generation, image creation, speech processing, and embeddings. LocalAI can run on consumer-grade hardware and does not necessarily require a GPU, making it accessible for local development and private deployments. ...
    Downloads: 35 This Week
    Last Update:
    See Project
  • 3
    ONNX Runtime

    ONNX Runtime

    ONNX Runtime: cross-platform, high performance ML inferencing

    ...Support for a variety of frameworks, operating systems and hardware platforms. Built-in optimizations that deliver up to 17X faster inferencing and up to 1.4X faster training.
    Downloads: 29 This Week
    Last Update:
    See Project
  • 4
    whisper.cpp

    whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    ...The command downloads the base.en model converted to custom ggml format and runs the inference on all .wav samples in the folder samples. whisper.cpp supports integer quantization of the Whisper ggml models. Quantized models require less memory and disk space and depending on the hardware can be processed more efficiently.
    Downloads: 537 This Week
    Last Update:
    See Project
  • $300 Free Credits to Build on Google Cloud Icon
    $300 Free Credits to Build on Google Cloud

    New to Google Cloud? Get $300 in credits to explore Compute Engine, BigQuery, Cloud Run, Gemini Enterprise Agent Platform, and more.

    Start your next project with $300 in free Google Cloud credit. Spin up VMs, run containers, query petabytes in BigQuery, or build agents with Gemini Enterprise Agent Platform. Once your credits are used, keep building with 20+ always-free tier products including Compute Engine, Cloud Storage, GKE, and Cloud Run functions. No commitment required—just sign up and start building.
    Claim $300 Free
  • 5
    FlashInfer

    FlashInfer

    FlashInfer: Kernel Library for LLM Serving

    ...It provides a high-performance framework that integrates seamlessly with existing systems, aiming to reduce latency and improve efficiency in LLM deployments. FlashInfer supports various hardware architectures and is built to scale with the demands of production environments.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    LLM.swift

    LLM.swift

    LLM.swift is a simple and readable library

    LLM.swift is a Swift package that enables developers to run Large Language Models (LLMs) directly on Apple devices, including iOS, macOS, and watchOS. By leveraging Apple's hardware and software optimizations, LLM.swift facilitates on-device natural language processing tasks, ensuring user privacy and reducing latency associated with cloud-based solutions.​
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Phi-3-MLX

    Phi-3-MLX

    Phi-3.5 for Mac: Locally-run Vision and Language Models

    Phi-3-Vision-MLX is an Apple MLX (machine learning on Apple silicon) implementation of Phi-3 Vision, a lightweight multi-modal model designed for vision and language tasks. It focuses on running vision-language AI efficiently on Apple hardware like M1 and M2 chips.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    ChatGLM.cpp

    ChatGLM.cpp

    C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)

    ChatGLM.cpp is a C++ implementation of the ChatGLM-6B model, enabling efficient local inference without requiring a Python environment. It is optimized for running on consumer hardware.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    MIVisionX

    MIVisionX

    Set of comprehensive computer vision & machine intelligence libraries

    ...AMD MIVisionX delivers highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions along with Convolution Neural Net Model Compiler & Optimizer supporting ONNX, and Khronos NNEF™ exchange formats. The toolkit allows for rapid prototyping and deployment of optimized computer vision and machine learning inference workloads on a wide range of computer hardware, including small embedded x86 CPUs, APUs, discrete GPUs, and heterogeneous servers. AMD OpenVX is a highly optimized open-source implementation of the Khronos OpenVX™ 1.3 computer vision specification. It allows for rapid prototyping as well as fast execution on a wide range of computer hardware, including small embedded x86 CPUs and large workstation discrete GPUs.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Train ML Models With SQL You Already Know Icon
    Train ML Models With SQL You Already Know

    BigQuery automates data prep, analysis, and predictions with built-in AI assistance.

    Build and deploy ML models using familiar SQL. Automate data prep with built-in Gemini. Query 1 TB and store 10 GB free monthly.
    Try Free
  • 10
    Bolt NLP

    Bolt NLP

    Bolt is a deep learning library with high performance

    Bolt is a high-performance deep learning inference framework developed by Huawei Noah's Ark Lab. It is designed to optimize and accelerate the deployment of deep learning models across various hardware platforms. Bolt is a light-weight library for deep learning. Bolt, as a universal deployment tool for all kinds of neural networks, aims to automate the deployment pipeline and achieve extreme acceleration. Bolt has been widely deployed and used in many departments of HUAWEI company, such as 2012 Laboratory, CBG and HUAWEI Product Lines. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    RamaLama

    RamaLama

    Simplifies the local serving of AI models from any source

    RamaLama is an open-source developer tool that simplifies working with and serving AI models locally or in production by leveraging container technologies like Docker, Podman, and OCI registries, allowing AI inference workflows to be treated like standard container deployments. It abstracts away much of the complexity of configuring AI runtimes, dependencies, and hardware optimizations by detecting available GPUs (or falling back to CPU) and automatically pulling a container image pre-configured for the detected hardware environment. Developers can use familiar container commands to pull, run, and interact with AI models from any source, treating models similarly to how container images are handled in OCI workflows. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    ONNX

    ONNX

    Open standard for machine learning interoperability

    ...It defines an extensible computation graph model, as well as definitions of built-in operators and standard data types. Currently we focus on the capabilities needed for inferencing (scoring). ONNX is widely supported and can be found in many frameworks, tools, and hardware. Enabling interoperability between different frameworks and streamlining the path from research to production helps increase the speed of innovation in the AI community.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 13
    OpenVINO

    OpenVINO

    OpenVINO™ Toolkit repository

    OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components: namely Model Optimizer, OpenVINO™ Runtime,...
    Downloads: 15 This Week
    Last Update:
    See Project
  • 14
    ExecuTorch

    ExecuTorch

    On-device AI across mobile, embedded and edge for PyTorch

    ExecuTorch is an end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of PyTorch models to edge devices.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    NNCF

    NNCF

    Neural Network Compression Framework for enhanced OpenVINO

    NNCF (Neural Network Compression Framework) is an optimization toolkit for deep learning models, designed to apply quantization, pruning, and other techniques to improve inference efficiency.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    gemma.cpp

    gemma.cpp

    lightweight, standalone C++ inference engine for Google's Gemma models

    Gemma.cpp is a C++ implementation for running inference with Gemma models efficiently on CPUs and GPUs. Developed by Google, it allows running large language models (LLMs) like Gemma with minimal hardware, focusing on optimized performance and low latency. Gemma.cpp is intended for developers seeking to deploy LLMs in production environments without needing massive computational resources.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    TensorFlow Model Optimization Toolkit

    TensorFlow Model Optimization Toolkit

    A toolkit to optimize ML models for deployment for Keras & TensorFlow

    ...Deploy models to edge devices with restrictions on processing, memory, power consumption, network usage, and model storage space. Enable execution on and optimize for existing hardware or new special purpose accelerators. Choose the model and optimization tool depending on your task. In many cases, pre-optimized models can improve the efficiency of your application. Try the post-training tools to optimize an already-trained TensorFlow model. Use training-time optimization tools and learn about the techniques.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    PEFT

    PEFT

    State-of-the-art Parameter-Efficient Fine-Tuning

    Parameter-Efficient Fine-Tuning (PEFT) methods enable efficient adaptation of pre-trained language models (PLMs) to various downstream applications without fine-tuning all the model's parameters. Fine-tuning large-scale PLMs is often prohibitively costly. In this regard, PEFT methods only fine-tune a small number of (extra) model parameters, thereby greatly decreasing the computational and storage costs. Recent State-of-the-Art PEFT techniques achieve performance comparable to that of full...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    AIMET

    AIMET

    AIMET is a library that provides advanced quantization and compression

    ...With this goal, QuIC presents the AI Model Efficiency Toolkit (AIMET) - a library that provides advanced quantization and compression techniques for trained neural network models. AIMET enables neural networks to run more efficiently on fixed-point AI hardware accelerators. Quantized inference is significantly faster than floating point inference. For example, models that we’ve run on the Qualcomm® Hexagon™ DSP rather than on the Qualcomm® Kryo™ CPU have resulted in a 5x to 15x speedup. Plus, an 8-bit model also has a 4x smaller memory footprint relative to a 32-bit model. However, often when quantizing a machine learning model (e.g., from 32-bit floating point to an 8-bit fixed point value), the model accuracy is sacrificed.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 20
    TensorFlow Probability

    TensorFlow Probability

    Probabilistic reasoning and statistical analysis in TensorFlow

    TensorFlow Probability is a library for probabilistic reasoning and statistical analysis. TensorFlow Probability (TFP) is a Python library built on TensorFlow that makes it easy to combine probabilistic models and deep learning on modern hardware (TPU, GPU). It's for data scientists, statisticians, ML researchers, and practitioners who want to encode domain knowledge to understand data and make predictions. Since TFP inherits the benefits of TensorFlow, you can build, fit, and deploy a model using a single language throughout the lifecycle of model exploration and production. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Xorbits Inference

    Xorbits Inference

    Replace OpenAI GPT with another LLM in your app

    Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop. Xorbits Inference(Xinference) is a powerful and versatile library designed to serve language, speech recognition, and multimodal models. With Xorbits...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    MegEngine

    MegEngine

    Easy-to-use deep learning framework with 3 key features

    MegEngine is a fast, scalable and easy-to-use deep learning framework with 3 key features. You can represent quantization/dynamic shape/image pre-processing and even derivation in one model. After training, just put everything into your model and inference it on any platform at ease. Speed and precision problems won't bother you anymore due to the same core inside. In training, GPU memory usage could go down to one-third at the cost of only one additional line, which enables the DTR...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 23
    AutoGPTQ

    AutoGPTQ

    An easy-to-use LLMs quantization package with user-friendly apis

    AutoGPTQ is an implementation of GPTQ (Quantized GPT) that optimizes large language models (LLMs) for faster inference by reducing their computational footprint while maintaining accuracy.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Autodistill

    Autodistill

    Images to inference with no labeling

    ...Using autodistill, you can go from unlabeled images to inference on a custom model running at the edge with no human intervention in between. You can use Autodistill on your own hardware, or use the Roboflow hosted version of Autodistill to label images in the cloud.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    EvaDB

    EvaDB

    Database system for building simpler and faster AI-powered application

    ...For example, the state-of-the-art object detection model takes multiple GPU years to process just a week’s videos from a single traffic monitoring camera. Besides the money spent on hardware, these models also increase the time that you spend waiting for the model inference to finish.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
Auth0 Logo