open-shell free download

API-for-Open-LLM

Openai style api for open large language models

API-for-Open-LLM is a lightweight API server designed for deploying and serving open large language models (LLMs), offering a simple way to integrate LLMs into applications.

Downloads: 0 This Week

Last Update: 2025-01-22

See Project

GPT4All

Run Local LLMs on Any Device. Open-source

GPT4All is an open-source project that allows users to run large language models (LLMs) locally on their desktops or laptops, eliminating the need for API calls or GPUs. The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware.

1 Review

Downloads: 148 This Week

Last Update: 2025-03-17

See Project

LMDeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs

LMDeploy is a toolkit designed for compressing, deploying, and serving large language models (LLMs). It offers tools and workflows to optimize LLMs for production environments, ensuring efficient performance and scalability. LMDeploy supports various model architectures and provides deployment solutions across different platforms.

Downloads: 37 This Week

Last Update: 5 days ago

See Project

vLLM

A high-throughput and memory-efficient inference and serving engine

vLLM is a fast and easy-to-use library for LLM inference and serving. High-throughput serving with various decoding algorithms, including parallel sampling, beam search, and more.

Downloads: 34 This Week

Last Update: 2026-04-03

See Project

FlashInfer

FlashInfer: Kernel Library for LLM Serving

FlashInfer is a kernel library designed to enhance the serving of Large Language Models (LLMs) by optimizing inference performance. It provides a high-performance framework that integrates seamlessly with existing systems, aiming to reduce latency and improve efficiency in LLM deployments. FlashInfer supports various hardware architectures and is built to scale with the demands of production environments.

Downloads: 22 This Week

Last Update: 7 days ago

See Project

Transformer Engine

A library for accelerating Transformer models on NVIDIA GPUs

Transformer Engine (TE) is a library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference. TE provides a collection of highly optimized building blocks for popular Transformer architectures and an automatic mixed precision-like API that can be used seamlessly with your framework-specific code. TE also includes a framework-agnostic C++...

Downloads: 27 This Week

Last Update: 2026-03-31

See Project

huggingface_hub

The official Python client for the Huggingface Hub

The huggingface_hub library allows you to interact with the Hugging Face Hub, a platform democratizing open-source Machine Learning for creators and collaborators. Discover pre-trained models and datasets for your projects or play with the thousands of machine-learning apps hosted on the Hub. You can also create and share your own models, datasets, and demos with the community. The huggingface_hub library provides a simple way to do all these things with Python.

Downloads: 16 This Week

Last Update: 3 days ago

See Project

Genv

GPU environment management and cluster orchestration

Genv is an open-source environment and cluster management system for GPUs. Genv lets you easily control, configure, monitor and enforce the GPU resources that you are using in a GPU machine or cluster. It is intended to ease up the process of GPU allocation for data scientists without code changes.

Downloads: 17 This Week

Last Update: 2024-05-16

See Project

Oumi

Everything you need to build state-of-the-art foundation models

Oumi is an open-source framework that provides everything needed to build state-of-the-art foundation models, end-to-end. It aims to simplify the development of large-scale machine-learning models.

Downloads: 11 This Week

Last Update: 2026-01-28

See Project

EasyOCR

Ready-to-use OCR with 80+ supported languages

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc. EasyOCR is a python module for extracting text from image. It is a general OCR that can read both natural scene text and dense text in document. We are currently supporting 80+ languages and expanding. Second-generation models: multiple times smaller size, multiple times faster inference, additional characters and comparable accuracy to the first...

Downloads: 19 This Week

Last Update: 2024-09-24

See Project

NNCF

Neural Network Compression Framework for enhanced OpenVINO

NNCF (Neural Network Compression Framework) is an optimization toolkit for deep learning models, designed to apply quantization, pruning, and other techniques to improve inference efficiency.

Downloads: 8 This Week

Last Update: 4 days ago

See Project

Triton Inference Server

The Triton Inference Server provides an optimized cloud

Triton Inference Server is an open-source inference serving software that streamlines AI inferencing. Triton enables teams to deploy any AI model from multiple deep learning and machine learning frameworks, including TensorRT, TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. Triton supports inference across cloud, data center, edge, and embedded devices on NVIDIA GPUs, x86 and ARM CPU, or AWS Inferentia.

Downloads: 11 This Week

Last Update: 2 days ago

See Project

Text Generation Inference

Large Language Model Text Generation Inference

Text Generation Inference is a high-performance inference server for text generation models, optimized for Hugging Face's Transformers. It is designed to serve large language models efficiently with optimizations for performance and scalability.

Downloads: 9 This Week

Last Update: 2025-12-18

See Project

RamaLama

Simplifies the local serving of AI models from any source

RamaLama is an open-source developer tool that simplifies working with and serving AI models locally or in production by leveraging container technologies like Docker, Podman, and OCI registries, allowing AI inference workflows to be treated like standard container deployments. It abstracts away much of the complexity of configuring AI runtimes, dependencies, and hardware optimizations by detecting available GPUs (or falling back to CPU) and automatically pulling a container image pre-configured for the detected hardware environment. ...

Downloads: 9 This Week

Last Update: 2026-03-19

See Project

AIMET

AIMET is a library that provides advanced quantization and compression

Qualcomm Innovation Center (QuIC) is at the forefront of enabling low-power inference at the edge through its pioneering model-efficiency research. QuIC has a mission to help migrate the ecosystem toward fixed-point inference. With this goal, QuIC presents the AI Model Efficiency Toolkit (AIMET) - a library that provides advanced quantization and compression techniques for trained neural network models. AIMET enables neural networks to run more efficiently on fixed-point AI hardware...

Downloads: 13 This Week

Last Update: 6 days ago

See Project

Adversarial Robustness Toolbox

Adversarial Robustness Toolbox (ART) - Python Library for ML security

Adversarial Robustness Toolbox (ART) is a Python library for Machine Learning Security. ART provides tools that enable developers and researchers to evaluate, defend, certify and verify Machine Learning models and applications against the adversarial threats of Evasion, Poisoning, Extraction, and Inference. ART supports all popular machine learning frameworks (TensorFlow, Keras, PyTorch, MXNet, sci-kit-learn, XGBoost, LightGBM, CatBoost, GPy, etc.), all data types (images, tables, audio,...

Downloads: 9 This Week

Last Update: 2025-07-07

See Project

Mistral Inference

Official inference library for Mistral models

Open and portable generative AI for devs and businesses. We release open-weight models for everyone to customize and deploy where they want it. Our super-efficient model Mistral Nemo is available under Apache 2.0, while Mistral Large 2 is available through both a free non-commercial license, and a commercial license.

Downloads: 6 This Week

Last Update: 2025-03-20

See Project

LazyLLM

Easiest and laziest way for building multi-agent LLMs applications

LazyLLM is an optimized, lightweight LLM server designed for easy and fast deployment of large language models. It is fully compatible with the OpenAI API specification, enabling developers to integrate their own models into applications that normally rely on OpenAI’s endpoints. LazyLLM emphasizes low resource usage and fast inference while supporting multiple models.

Downloads: 8 This Week

Last Update: 2026-03-04

See Project

optillm

Optimizing inference proxy for LLMs

OptiLLM is an optimizing inference proxy for Large Language Models (LLMs) that implements state-of-the-art techniques to enhance performance and efficiency. It serves as an OpenAI API-compatible proxy, allowing for seamless integration into existing workflows while optimizing inference processes. OptiLLM aims to reduce latency and resource consumption during LLM inference.

Downloads: 6 This Week

Last Update: 2026-03-19

See Project

OpenLLM

Operating LLMs in production

An open platform for operating large language models (LLMs) in production. Fine-tune, serve, deploy, and monitor any LLMs with ease. With OpenLLM, you can run inference with any open-source large-language models, deploy to the cloud or on-premises, and build powerful AI apps. Built-in supports a wide range of open-source LLMs and model runtime, including Llama 2， StableLM, Falcon, Dolly, Flan-T5, ChatGLM, StarCoder, and more.

Downloads: 5 This Week

Last Update: 2025-04-21

See Project

ModelScope

Bring the notion of Model-as-a-Service to life

...It seeks to bring together most advanced machine learning models from the AI community, and streamlines the process of leveraging AI models in real-world applications. The core ModelScope library open-sourced in this repository provides the interfaces and implementations that allow developers to perform model inference, training and evaluation. In particular, with rich layers of API abstraction, the ModelScope library offers unified experience to explore state-of-the-art models spanning across domains such as CV, NLP, Speech, Multi-Modality, and Scientific-computation. ...

Downloads: 8 This Week

Last Update: 2 days ago

See Project

SetFit

Efficient few-shot learning with Sentence Transformers

SetFit is an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers. It achieves high accuracy with little labeled data - for instance, with only 8 labeled examples per class on the Customer Reviews sentiment dataset, SetFit is competitive with fine-tuning RoBERTa Large on the full training set of 3k examples.

Downloads: 7 This Week

Last Update: 2025-08-05

See Project

Causal ML

Uplift modeling and causal inference with machine learning algorithms

Causal ML is a Python package that provides a suite of uplift modeling and causal inference methods using machine learning algorithms based on recent research [1]. It provides a standard interface that allows users to estimate the Conditional Average Treatment Effect (CATE) or Individual Treatment Effect (ITE) from experimental or observational data. Essentially, it estimates the causal impact of intervention T on outcome Y for users with observed features X, without strong assumptions on...

Downloads: 8 This Week

Last Update: 2026-02-06

See Project

DoWhy

DoWhy is a Python library for causal inference

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks. Much like machine learning libraries have done for prediction, DoWhy is a Python library that aims to spark causal thinking and analysis. DoWhy provides a wide variety of algorithms for effect estimation, causal structure learning, diagnosis of causal...

Downloads: 8 This Week

Last Update: 2025-11-03

See Project

LLM Foundry

LLM training code for MosaicML foundation models

...This has led to a flurry of activity centered on open-source LLMs, such as the LLaMA series from Meta, the Pythia series from EleutherAI, the StableLM series from StabilityAI, and the OpenLLaMA model from Berkeley AI Research.

Downloads: 5 This Week

Last Update: 2025-07-29

See Project

Search Results for "open-shell"

Showing 100 open source projects for "open-shell"

API-for-Open-LLM

GPT4All

LMDeploy

vLLM

FlashInfer

Transformer Engine

huggingface_hub

Genv

Oumi

EasyOCR

NNCF

Triton Inference Server

Text Generation Inference

RamaLama

AIMET

Adversarial Robustness Toolbox

Mistral Inference

LazyLLM

optillm

OpenLLM

ModelScope

SetFit

Causal ML

DoWhy

LLM Foundry

Search Results for "open-shell"

Showing 100 open source projects for "open-shell"

API-for-Open-LLM

GPT4All

LMDeploy

vLLM

FlashInfer

Transformer Engine

huggingface_hub

Genv

Oumi

EasyOCR

NNCF

Triton Inference Server

Text Generation Inference

RamaLama

AIMET

Adversarial Robustness Toolbox

Mistral Inference

LazyLLM

optillm

OpenLLM

ModelScope

SetFit

Causal ML

DoWhy

LLM Foundry

Related Searches

Related Categories