Search Results for "python::module" - Page 4

Showing 110 open source projects for "python::module"

1

TensorFlow Model Optimization Toolkit

A toolkit to optimize ML models for deployment for Keras & TensorFlow

The TensorFlow Model Optimization Toolkit is a suite of tools for optimizing ML models for deployment and execution. Among many uses, the toolkit supports techniques used to reduce latency and inference costs for cloud and edge devices (e.g. mobile, IoT). Deploy models to edge devices with restrictions on processing, memory, power consumption, network usage, and model storage space. Enable execution on and optimize for existing hardware or new special purpose accelerators. Choose the model...

Downloads: 0 This Week

Last Update: 2024-02-08
See Project
2

Infinity

Low-latency REST API for serving text-embeddings

Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting all sentence-transformer models and frameworks. Infinity is developed under MIT License. Infinity powers inference behind Gradient.ai and other Embedding API providers.

Downloads: 0 This Week

Last Update: 2025-08-22
See Project
3

Transformer Engine

A library for accelerating Transformer models on NVIDIA GPUs

Transformer Engine (TE) is a library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference. TE provides a collection of highly optimized building blocks for popular Transformer architectures and an automatic mixed precision-like API that can be used seamlessly with your framework-specific code. TE also includes a framework-agnostic C++...

Downloads: 0 This Week

Last Update: 2025-10-31
See Project
4

KServe

Standardized Serverless ML Inference Platform on Kubernetes

KServe provides a Kubernetes Custom Resource Definition for serving machine learning (ML) models on arbitrary frameworks. It aims to solve production model serving use cases by providing performant, high-abstraction interfaces for common ML frameworks like TensorFlow, XGBoost, scikit-learn, PyTorch, and ONNX. It encapsulates the complexity of autoscaling, networking, health checking, and server configuration to bring cutting-edge serving features like GPU Autoscaling, Scale to Zero, and...

Downloads: 0 This Week

Last Update: 2025-11-03
See Project
5

OpenFold

Trainable, memory-efficient, and GPU-friendly PyTorch reproduction

OpenFold carefully reproduces (almost) all of the features of the original open source inference code (v2.0.1). The sole exception is model ensembling, which fared poorly in DeepMind's own ablation testing and is being phased out in future DeepMind experiments. It is omitted here for the sake of reducing clutter. In cases where the Nature paper differs from the source, we always defer to the latter. OpenFold is trainable in full precision, half precision, or bfloat16 with or without...

Downloads: 0 This Week

Last Update: 2025-04-26
See Project
6

Xorbits Inference

Replace OpenAI GPT with another LLM in your app

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you can run inference with any open-source language model, speech recognition model, or multimodal model, whether in the cloud, on-premises, or even on your laptop. Xorbits Inference (Xinference) is a powerful and versatile library designed to serve language, speech recognition, and multimodal models. With Xorbits...

Downloads: 0 This Week

Last Update: 2025-11-14
See Project
7

LLM Foundry

LLM training code for MosaicML foundation models

Introducing MPT-7B, the first entry in our MosaicML Foundation Series. MPT-7B is a transformer trained from scratch on 1T tokens of text and code. It is open source, available for commercial use, and matches the quality of LLaMA-7B. MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k. Large language models (LLMs) are changing the world, but for those outside well-resourced industry labs, it can be extremely difficult to train and deploy...

Downloads: 0 This Week

Last Update: 2025-07-29
See Project
8

SAHI

A lightweight vision library for performing large object detection

A lightweight vision library for performing large-scale object detection & instance segmentation. Object detection and instance segmentation are by far the most important fields of application in Computer Vision. However, detection of small objects and inference on large images are still major issues in practical usage. SAHI helps developers overcome these real-world problems with many vision utilities. Detection of small objects and objects far away in the scene is a major...

Downloads: 0 This Week

Last Update: 2025-09-28
See Project
9

SageMaker Hugging Face Inference Toolkit

Library for serving Transformers models on Amazon SageMaker

SageMaker Hugging Face Inference Toolkit is an open-source library for serving Transformers models on Amazon SageMaker. This library provides default pre-processing, predict and postprocessing for certain Transformers models and tasks. It utilizes the SageMaker Inference Toolkit for starting up the model server, which is responsible for handling inference requests. For the Dockerfiles used for building SageMaker Hugging Face Containers, see AWS Deep Learning Containers. The SageMaker Hugging...

Downloads: 0 This Week

Last Update: 2025-04-23
See Project
10

DeepSpeed

Deep learning optimization library: makes distributed training easy

DeepSpeed is an easy-to-use deep learning optimization software suite that enables unprecedented scale and speed for Deep Learning Training and Inference. With DeepSpeed you can: 1. Train/Inference dense or sparse models with billions or trillions of parameters 2. Achieve excellent system throughput and efficiently scale to thousands of GPUs 3. Train/Inference on resource constrained GPU systems 4. Achieve unprecedented low latency and high throughput for inference 5. Achieve extreme...

Downloads: 1 This Week

Last Update: 2025-11-04
See Project
11

UForm

Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion

UForm is a Multi-Modal Inference package, designed to encode Multi-Lingual Texts, Images, and, soon, Audio, Video, and Documents, into a shared vector space! It comes with a set of homonymous pre-trained networks available on the HuggingFace portal and extends the transformers package to support Mid-fusion Models. Late-fusion models encode each modality independently, but into one shared vector space. Due to independent encoding, late-fusion models are good at capturing coarse-grained...

Downloads: 0 This Week

Last Update: 2025-10-30
See Project
12

marqo

Tensor search for humans

A tensor-based search and analytics engine that seamlessly integrates with your applications, websites, and workflows. Marqo is a versatile and robust search and analytics engine that can be integrated into any website or application. Due to horizontal scalability, Marqo provides lightning-fast query times, even with millions of documents. Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images. It can seamlessly handle image-to-image, image-to-text and...

Downloads: 0 This Week

Last Update: 4 days ago
See Project
13

AWS Neuron

Powering Amazon custom machine learning chips

AWS Neuron is a software development kit (SDK) for running machine learning inference using AWS Inferentia chips. It consists of a compiler, run-time, and profiling tools that enable developers to run high-performance and low-latency inference using AWS Inferentia-based Amazon EC2 Inf1 instances. Using Neuron, developers can easily train their machine learning models on any popular framework such as TensorFlow, PyTorch, and MXNet, and run them optimally on Amazon EC2 Inf1 instances. You can...

Downloads: 0 This Week

Last Update: 2025-10-29
See Project
14

OpenMLDB

OpenMLDB is an open-source machine learning database

...Real-time features are essential for many machine learning applications, such as real-time personalized recommendations and risk analytics. However, a feature engineering script developed by data scientists (Python scripts in most cases) cannot be directly deployed into production for online inference because it usually cannot meet the engineering requirements, such as low latency, high throughput and high availability.

Downloads: 0 This Week

Last Update: 2025-02-21
See Project
15

DALI

A GPU-accelerated library containing highly optimized building blocks

The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provides a collection of highly optimized building blocks for loading and processing image, video and audio data. It can be used as a portable drop-in replacement for built-in data loaders and data iterators in popular deep learning frameworks. Deep learning applications require complex, multi-stage data processing pipelines that include loading, decoding,...

Downloads: 0 This Week

Last Update: 2025-10-16
See Project
16

spaGO

Self-contained Machine Learning and Natural Language Processing lib

...Spago is self-contained, in that it uses its own lightweight computational graph both for training and inference, easy to understand from start to finish. The core module of Spago relies only on testify for unit testing. In other words, it has "zero dependencies", and we are committed to keeping it that way as much as possible. Spago uses a multi-module workspace to ensure that additional dependencies are downloaded only when specific features (e.g. persistent embeddings) are used. A good place to start is by looking at the implementation of built-in neural models, such as the LSTM. ...

Downloads: 0 This Week

Last Update: 2023-10-30
See Project
17

SageMaker Inference Toolkit

Serve machine learning models within a Docker container

Serve machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. Once you have a trained model, you can include it in a Docker container that runs your inference code. A container provides an effectively isolated environment, ensuring a consistent runtime regardless of where the...

Downloads: 0 This Week

Last Update: 2023-10-25
See Project
18

LLMFlows

LLMFlows - Simple, Explicit and Transparent LLM Apps

LLMFlows is a framework for building simple, explicit, and transparent applications utilizing Large Language Models (LLMs). It emphasizes clarity and control in the development process, allowing developers to create LLM-powered applications with well-defined workflows and interactions. LLMFlows supports various LLMs and provides tools to manage prompts, responses, and application logic effectively.

Downloads: 0 This Week

Last Update: 2025-03-19
See Project
19

llama2-webui

Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere

Running Llama 2 with gradio web UI on GPU or CPU from anywhere (Linux/Windows/Mac).

Downloads: 0 This Week

Last Update: 2023-10-04
See Project
20

Medusa

Framework for Accelerating LLM Generation with Multiple Decoding Heads

Medusa is a framework aimed at accelerating the generation capabilities of Large Language Models (LLMs) by employing multiple decoding heads. This approach allows for parallel processing during text generation, significantly enhancing throughput and reducing response times. Medusa is designed to be simple to implement and integrates with existing LLM infrastructures, making it a practical solution for scaling LLM applications.

Downloads: 0 This Week

Last Update: 2025-03-19
See Project
21

ollama_manager_gui

A graphical manager for ollama that can manage your LLMs

This app helps install ollama and LLMs through its GUI. It checks for ollama when launched and, if it is not found, takes you to the ollama site to download it. This app is heavily upgraded and now also works properly on Linux. It now has progress bars and many other improvements. It can launch an LLM by clicking the link, and it can launch multiple LLMs in separate windows. It can also remove an installed LLM. There is a confirmation...

Downloads: 3 This Week

Last Update: 2025-08-14
See Project
22

Petals

Run 100B+ language models at home, BitTorrent-style

Run 100B+ language models at home, BitTorrent‑style. Run large language models like BLOOM-176B collaboratively — you load a small part of the model, then team up with people serving the other parts to run inference or fine-tuning. Single-batch inference runs at ≈ 1 sec per step (token) — up to 10x faster than offloading, enough for chatbots and other interactive apps. Parallel inference reaches hundreds of tokens/sec. Beyond classic language model APIs — you can employ any fine-tuning and...

Downloads: 0 This Week

Last Update: 2023-09-06
See Project
23

OpenFieldAI - AI Open Field Test Tracker

OpenFieldAI is an AI based Open Field Test Rodent Tracker

OpenFieldAI uses an AI-CNN to track rodent movement with pretrained OFAI models, or users can create their own model with YOLOv8 for inference. The software generates a centroid graph, a heat map, a line path, and a spreadsheet containing all calculated parameters (speed, time in and out of the ROI, distance, and entries/exits) for single or multiple pre-recorded videos or live webcam video. The ROI is assigned automatically for multiple-video input, and can be manually given in single...

Downloads: 0 This Week

Last Update: 2025-10-21
See Project
24

pipeless

A computer vision framework to create and deploy apps in minutes

Pipeless is an open-source computer vision framework to create and deploy applications without the complexity of building and maintaining multimedia pipelines. It ships everything you need to create and deploy efficient computer vision applications that work in real-time in just minutes. Pipeless is inspired by modern serverless technologies. It provides the development experience of serverless frameworks applied to computer vision. You provide some functions that are executed for new...

Downloads: 1 This Week

Last Update: 2024-02-23
See Project
25

Lightning Bolts

Toolbox of models, callbacks, and datasets for AI/ML researchers

Bolts package provides a variety of components to extend PyTorch Lightning, such as callbacks & datasets, for applied research and production. Torch ORT converts your model into an optimized ONNX graph, speeding up training & inference when using NVIDIA or AMD GPUs. We can introduce sparsity during fine-tuning with SparseML, which ultimately allows us to leverage the DeepSparse engine to see performance improvements at inference time.

Downloads: 1 This Week

Last Update: 2024-08-15
See Project

© 2025 Slashdot Media. All Rights Reserved.
