Showing 103 open source projects for "model-builder"

View related business solutions
  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • Enterprise-grade ITSM, for every business Icon
    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity.

    Freshservice is an intuitive, AI-powered platform that helps IT, operations, and business teams deliver exceptional service without the usual complexity. Automate repetitive tasks, resolve issues faster, and provide seamless support across the organization. From managing incidents and assets to driving smarter decisions, Freshservice makes it easy to stay efficient and scale with confidence.
    Try it Free
  • 1
    OpenVINO Model Server

    OpenVINO Model Server

    A scalable inference server for models optimized with OpenVINO

    ...The system supports a wide range of model sources, letting you host models from local storage, remote object storage, or even pull from model hubs.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 2
    TensorFlow Model Optimization Toolkit

    TensorFlow Model Optimization Toolkit

    A toolkit to optimize ML models for deployment for Keras & TensorFlow

    The TensorFlow Model Optimization Toolkit is a suite of tools for optimizing ML models for deployment and execution. Among many uses, the toolkit supports techniques used to reduce latency and inference costs for cloud and edge devices (e.g. mobile, IoT). Deploy models to edge devices with restrictions on processing, memory, power consumption, network usage, and model storage space.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    llama.cpp

    llama.cpp

    Port of Facebook's LLaMA model in C/C++

    The llama.cpp project enables the inference of Meta's LLaMA model (and other models) in pure C/C++ without requiring a Python runtime. It is designed for efficient and fast model execution, offering easy integration for applications needing LLM-based capabilities. The repository focuses on providing a highly optimized and portable implementation for running large language models directly within C/C++ environments.
    Downloads: 342 This Week
    Last Update:
    See Project
  • 4
    whisper.cpp

    whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    whisper.cpp is a lightweight, C/C++ reimplementation of OpenAI’s Whisper automatic speech recognition (ASR) model—designed for efficient, standalone transcription without external dependencies. The entire high-level implementation of the model is contained in whisper.h and whisper.cpp. The rest of the code is part of the ggml machine learning library. The command downloads the base.en model converted to custom ggml format and runs the inference on all .wav samples in the folder samples. whisper.cpp supports integer quantization of the Whisper ggml models. ...
    Downloads: 387 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 5
    Open WebUI

    Open WebUI

    User-friendly AI Interface

    ...Additionally, Open WebUI offers a Progressive Web App (PWA) for mobile devices, providing offline access and a native app-like experience. The platform also includes a Model Builder, allowing users to create custom models from base Ollama models directly within the interface. With over 156,000 users, Open WebUI is a versatile solution for deploying and managing AI models in a secure, offline environment.
    Downloads: 171 This Week
    Last Update:
    See Project
  • 6
    EasyOCR

    EasyOCR

    Ready-to-use OCR with 80+ supported languages

    ...Second-generation models: multiple times smaller size, multiple times faster inference, additional characters and comparable accuracy to the first generation models. EasyOCR will choose the latest model by default but you can also specify which model to use. Model weights for the chosen language will be automatically downloaded or you can download them manually from the model hub. The idea is to be able to plug-in any state-of-the-art model into EasyOCR. There are a lot of geniuses trying to make better detection/recognition models, but we are not trying to be geniuses here. ...
    Downloads: 28 This Week
    Last Update:
    See Project
  • 7
    AIMET

    AIMET

    AIMET is a library that provides advanced quantization and compression

    ...Quantized inference is significantly faster than floating point inference. For example, models that we’ve run on the Qualcomm® Hexagon™ DSP rather than on the Qualcomm® Kryo™ CPU have resulted in a 5x to 15x speedup. Plus, an 8-bit model also has a 4x smaller memory footprint relative to a 32-bit model. However, often when quantizing a machine learning model (e.g., from 32-bit floating point to an 8-bit fixed point value), the model accuracy is sacrificed.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 8
    ModelScope

    ModelScope

    Bring the notion of Model-as-a-Service to life

    ModelScope is built upon the notion of “Model-as-a-Service” (MaaS). It seeks to bring together most advanced machine learning models from the AI community, and streamlines the process of leveraging AI models in real-world applications. The core ModelScope library open-sourced in this repository provides the interfaces and implementations that allow developers to perform model inference, training and evaluation.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 9
    Arize Phoenix

    Arize Phoenix

    Uncover insights, surface problems, monitor, and fine tune your LLM

    Phoenix provides ML insights at lightning speed with zero-config observability for model drift, performance, and data quality. Phoenix is an Open Source ML Observability library designed for the Notebook. The toolset is designed to ingest model inference data for LLMs, CV, NLP and tabular datasets. It allows Data Scientists to quickly visualize their model data, monitor performance, track down issues & insights, and easily export to improve.
    Downloads: 7 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 10
    Mosec

    Mosec

    A high-performance ML model serving framework, offers dynamic batching

    Mosec is a high-performance and flexible model-serving framework for building ML model-enabled backend and microservices. It bridges the gap between any machine learning models you just trained and the efficient online service API.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 11
    Oumi

    Oumi

    Everything you need to build state-of-the-art foundation models

    Oumi is an open-source framework that provides everything needed to build state-of-the-art foundation models, end-to-end. It aims to simplify the development of large-scale machine-learning models.
    Downloads: 17 This Week
    Last Update:
    See Project
  • 12
    rwkv.cpp

    rwkv.cpp

    INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model

    Besides the usual FP32, it supports FP16, quantized INT4, INT5 and INT8 inference. This project is focused on CPU, but cuBLAS is also supported. RWKV is a novel large language model architecture, with the largest model in the family having 14B parameters. In contrast to Transformer with O(n^2) attention, RWKV requires only state from the previous step to calculate logits. This makes RWKV very CPU-friendly on large context lengths.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 13
    BentoML

    BentoML

    Unified Model Serving Framework

    BentoML simplifies ML model deployment and serves your models at a production scale. Support multiple ML frameworks natively: Tensorflow, PyTorch, XGBoost, Scikit-Learn and many more! Define custom serving pipeline with pre-processing, post-processing and ensemble models. Standard .bento format for packaging code, models and dependencies for easy versioning and deployment.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 14
    LocalAI

    LocalAI

    The free, Open Source alternative to OpenAI, Claude and others

    ...It acts as a drop-in replacement for APIs such as OpenAI, enabling developers to build AI-powered applications without relying on external cloud services. The platform supports a wide range of model types, including text generation, image creation, speech processing, and embeddings. LocalAI can run on consumer-grade hardware and does not necessarily require a GPU, making it accessible for local development and private deployments. It integrates with multiple backends like llama.cpp, transformers, and diffusers to support different AI workloads. ...
    Downloads: 20 This Week
    Last Update:
    See Project
  • 15
    OpenVINO Training Extensions

    OpenVINO Training Extensions

    Trainable models and NN optimization tools

    ...When ote_cli is installed in the virtual environment, you can use the ote command line interface to perform various actions for templates related to the chosen task type, such as running, training, evaluating, exporting, etc. ote train trains a model (a particular model template) on a dataset and saves results in two files. ote optimize optimizes a pre-trained model using NNCF or POT depending on the model format. NNCF optimization used for trained snapshots in a framework-specific format. POT optimization used for models exported in the OpenVINO IR format.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 16
    TorchServe

    TorchServe

    Serve, optimize and scale PyTorch models in production

    TorchServe is a performant, flexible and easy-to-use tool for serving PyTorch eager mode and torschripted models. Multi-model management with the optimized worker to model allocation. REST and gRPC support for batched inference. Export your model for optimized inference. Torchscript out of the box, ORT, IPEX, TensorRT, FasterTransformer. Performance Guide: built-in support to optimize, benchmark and profile PyTorch and TorchServe performance. Expressive handlers: An expressive handler architecture that makes it trivial to support inferencing for your use case with many supported out of the box. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 17
    LLamaSharp

    LLamaSharp

    C#/.NET binding of llama.cpp, including LLaMa/GPT model inference

    The C#/.NET binding of llama.cpp. It provides APIs to infer the LLaMa Models and deploy it on the local environment. It works on both Windows, Linux and MAC without the requirement for compiling llama.cpp yourself. Its performance is close to llama.cpp. Furthermore, it provides integrations with other projects such as BotSharp to provide higher-level applications and UI.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    hfapigo

    hfapigo

    Unofficial (Golang) Go bindings for the Hugging Face Inference API

    (Golang) Go bindings for the Hugging Face Inference API. Directly call any model available in the Model Hub. An API key is required for authorized access. To get one, create a Hugging Face profile.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 19
    KubeAI

    KubeAI

    Private Open AI on Kubernetes

    Get inferencing running on Kubernetes: LLMs, Embeddings, Speech-to-Text. KubeAI serves an OpenAI compatible HTTP API. Admins can configure ML models by using the Model Kubernetes Custom Resources. KubeAI can be thought of as a Model Operator (See Operator Pattern) that manages vLLM and Ollama servers.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 20
    NNCF

    NNCF

    Neural Network Compression Framework for enhanced OpenVINO

    NNCF (Neural Network Compression Framework) is an optimization toolkit for deep learning models, designed to apply quantization, pruning, and other techniques to improve inference efficiency.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    ncnn

    ncnn

    High-performance neural network inference framework for mobile

    ncnn is a high-performance neural network inference computing framework designed specifically for mobile platforms. It brings artificial intelligence right at your fingertips with no third-party dependencies, and speeds faster than all other known open source frameworks for mobile phone cpu. ncnn allows developers to easily deploy deep learning algorithm models to the mobile platform and create intelligent APPs. It is cross-platform and supports most commonly used CNN networks, including...
    Downloads: 55 This Week
    Last Update:
    See Project
  • 22
    RamaLama

    RamaLama

    Simplifies the local serving of AI models from any source

    ...Developers can use familiar container commands to pull, run, and interact with AI models from any source, treating models similarly to how container images are handled in OCI workflows. RamaLama supports multiple model registries and offers a REST API or chatbot interface for interacting with running models, making it flexible for local development, testing, or integration into larger systems.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 23
    Triton Inference Server

    Triton Inference Server

    The Triton Inference Server provides an optimized cloud

    ...Provides Backend API that allows adding custom backends and pre/post-processing operations. Model pipelines using Ensembling or Business Logic Scripting (BLS). HTTP/REST and GRPC inference protocols based on the community-developed KServe protocol. A C API and Java API allow Triton to link directly into your application for edge and other in-process use cases.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 24
    ONNX Runtime

    ONNX Runtime

    ONNX Runtime: cross-platform, high performance ML inferencing

    ...ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators where applicable alongside graph optimizations and transforms. ONNX Runtime training can accelerate the model training time on multi-node NVIDIA GPUs for transformer models with a one-line addition for existing PyTorch training scripts. Support for a variety of frameworks, operating systems and hardware platforms. Built-in optimizations that deliver up to 17X faster inferencing and up to 1.4X faster training.
    Downloads: 29 This Week
    Last Update:
    See Project
  • 25
    SageMaker Hugging Face Inference Toolkit

    SageMaker Hugging Face Inference Toolkit

    Library for serving Transformers models on Amazon SageMaker

    ...This library provides default pre-processing, predict and postprocessing for certain Transformers models and tasks. It utilizes the SageMaker Inference Toolkit for starting up the model server, which is responsible for handling inference requests. For the Dockerfiles used for building SageMaker Hugging Face Containers, see AWS Deep Learning Containers. The SageMaker Hugging Face Inference Toolkit implements various additional environment variables to simplify your deployment experience. The Hugging Face Inference Toolkit allows user to override the default methods of the HuggingFaceHandlerService. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next