Search Results for "windows" - Page 2


Showing 94 open source projects for "windows"

1

LitGPT

20+ high-performance LLMs with recipes to pretrain, finetune at scale

LitGPT is a collection of over 20 high-performance large language models (LLMs) accompanied by recipes to pretrain, finetune, and deploy them at scale. It provides implementations without abstractions, making it beginner-friendly while offering advanced features like flash attention and support for various precision levels. LitGPT is designed to run efficiently across multiple GPUs or TPUs, catering to both small-scale and large-scale deployments.

Downloads: 1 This Week

Last Update: 2025-12-18
2

Phi-3-MLX

Phi-3.5 for Mac: Locally-run Vision and Language Models

Phi-3-Vision-MLX is an Apple MLX (machine learning on Apple silicon) implementation of Phi-3 Vision, a lightweight multi-modal model designed for vision and language tasks. It focuses on running vision-language AI efficiently on Apple hardware like M1 and M2 chips.

Downloads: 1 This Week

Last Update: 2025-03-13
3

OpenLLM

Operating LLMs in production

An open platform for operating large language models (LLMs) in production. Fine-tune, serve, deploy, and monitor any LLM with ease. With OpenLLM, you can run inference with any open-source large language model, deploy to the cloud or on-premises, and build powerful AI apps. It has built-in support for a wide range of open-source LLMs and model runtimes, including Llama 2, StableLM, Falcon, Dolly, Flan-T5, ChatGLM, StarCoder, and more. Serve LLMs over RESTful API or gRPC with one command, query via...

Downloads: 1 This Week

Last Update: 2025-04-21
4

SageMaker Python SDK

Training and deploying machine learning models on Amazon SageMaker

SageMaker Python SDK is an open source library for training and deploying machine learning models on Amazon SageMaker. With the SDK, you can train and deploy models using popular deep learning frameworks Apache MXNet and TensorFlow. You can also train and deploy models with Amazon algorithms, which are scalable implementations of core machine learning algorithms that are optimized for SageMaker and GPU training. If you have your own algorithms built into SageMaker-compatible Docker...

Downloads: 1 This Week

Last Update: 6 days ago
5

RamaLama

Simplifies the local serving of AI models from any source

RamaLama is an open-source developer tool that simplifies working with and serving AI models locally or in production by leveraging container technologies like Docker, Podman, and OCI registries, allowing AI inference workflows to be treated like standard container deployments. It abstracts away much of the complexity of configuring AI runtimes, dependencies, and hardware optimizations by detecting available GPUs (or falling back to CPU) and automatically pulling a container image...

Downloads: 1 This Week

Last Update: 7 days ago
6

Mosec

A high-performance ML model serving framework that offers dynamic batching

Mosec is a high-performance and flexible model-serving framework for building ML model-enabled backend and microservices. It bridges the gap between any machine learning models you just trained and the efficient online service API.

Downloads: 1 This Week

Last Update: 2026-04-15
7

KServe

Standardized Serverless ML Inference Platform on Kubernetes

KServe provides a Kubernetes Custom Resource Definition for serving machine learning (ML) models on arbitrary frameworks. It aims to solve production model serving use cases by providing performant, high-abstraction interfaces for common ML frameworks like TensorFlow, XGBoost, scikit-learn, PyTorch, and ONNX. It encapsulates the complexity of autoscaling, networking, health checking, and server configuration to bring cutting-edge serving features like GPU Autoscaling, Scale to Zero, and...

Downloads: 1 This Week

Last Update: 2026-03-13
8

Pytorch-toolbelt

PyTorch extensions for fast R&D prototyping and Kaggle farming

pytorch-toolbelt is a Python library with a set of bells and whistles for PyTorch for fast R&D prototyping and Kaggle farming. Easy model building using a flexible encoder-decoder architecture. Modules: CoordConv, SCSE, Hypercolumn, depthwise separable convolution, and more. GPU-friendly test-time augmentation (TTA) for segmentation and classification. GPU-friendly inference on huge (5000x5000) images. Everyday common routines (fix/restore random seed, filesystem utils, metrics). Losses:...

Downloads: 1 This Week

Last Update: 2024-11-21
9

Pandas Profiling

Create HTML profiling reports from pandas DataFrame objects

pandas-profiling generates profile reports from a pandas DataFrame. The pandas df.describe() function is handy yet a little basic for exploratory data analysis. pandas-profiling extends pandas DataFrame with df.profile_report(), which automatically generates a standardized univariate and multivariate report for data understanding. High correlation warnings, based on different correlation metrics (Spearman, Pearson, Kendall, Cramér’s V, Phik). Most common categories (uppercase, lowercase,...

Downloads: 1 This Week

Last Update: 23 hours ago
10

Ray

A unified framework for scalable computing

Modern workloads like deep learning and hyperparameter tuning are compute-intensive and require distributed or parallel execution. Ray makes it effortless to parallelize single-machine code: go from a single CPU to multi-core, multi-GPU, or multi-node with minimal code changes. Accelerate your PyTorch and TensorFlow workloads with a more resource-efficient and flexible distributed execution framework powered by Ray. Accelerate your hyperparameter search workloads with Ray Tune. Find the best...

Downloads: 1 This Week

Last Update: 4 days ago
11

DeepSpeed

Deep learning optimization library: makes distributed training easy

DeepSpeed is an easy-to-use deep learning optimization software suite that enables unprecedented scale and speed for Deep Learning Training and Inference. With DeepSpeed you can:
1. Train/Inference dense or sparse models with billions or trillions of parameters
2. Achieve excellent system throughput and efficiently scale to thousands of GPUs
3. Train/Inference on resource constrained GPU systems
4. Achieve unprecedented low latency and high throughput for inference
5. Achieve extreme...

Downloads: 1 This Week

Last Update: 2026-03-30
12

SparseML

Libraries for applying sparsification recipes to neural networks

SparseML is an optimization toolkit for training and deploying deep learning models using sparsification techniques like pruning and quantization to improve efficiency.

Downloads: 0 This Week

Last Update: 2025-06-02
13

GPflow

Gaussian processes in TensorFlow

GPflow is a package for building Gaussian process models in Python. It implements modern Gaussian process inference for composable kernels and likelihoods. GPflow builds on TensorFlow 2.4+ and TensorFlow Probability for running computations, which allows fast execution on GPUs.

Downloads: 0 This Week

Last Update: 2025-05-29
14

NNCF

Neural Network Compression Framework for enhanced OpenVINO

NNCF (Neural Network Compression Framework) is an optimization toolkit for deep learning models, designed to apply quantization, pruning, and other techniques to improve inference efficiency.

Downloads: 0 This Week

Last Update: 2026-04-08
15

API-for-Open-LLM

OpenAI-style API for open large language models

API-for-Open-LLM is a lightweight API server designed for deploying and serving open large language models (LLMs), offering a simple way to integrate LLMs into applications.

Downloads: 0 This Week

Last Update: 2025-01-22
16

DeepSparse

Sparsity-aware deep learning inference runtime for CPUs

A sparsity-aware enterprise inferencing system for AI models on CPUs. Maximize your CPU infrastructure with DeepSparse to run performant computer vision (CV), natural language processing (NLP), and large language models (LLMs).

Downloads: 0 This Week

Last Update: 2025-06-02
17

Text Generation Inference

Large Language Model Text Generation Inference

Text Generation Inference is a high-performance inference server for text generation models, optimized for Hugging Face's Transformers. It is designed to serve large language models efficiently with optimizations for performance and scalability.

Downloads: 0 This Week

Last Update: 2025-12-18
18

LazyLLM

Easiest and laziest way to build multi-agent LLM applications

LazyLLM is an optimized, lightweight LLM server designed for easy and fast deployment of large language models. It is fully compatible with the OpenAI API specification, enabling developers to integrate their own models into applications that normally rely on OpenAI’s endpoints. LazyLLM emphasizes low resource usage and fast inference while supporting multiple models.

Downloads: 0 This Week

Last Update: 2026-03-04
19

SetFit

Efficient few-shot learning with Sentence Transformers

SetFit is an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers. It achieves high accuracy with little labeled data - for instance, with only 8 labeled examples per class on the Customer Reviews sentiment dataset, SetFit is competitive with fine-tuning RoBERTa Large on the full training set of 3k examples.

Downloads: 0 This Week

Last Update: 2025-08-05
20

TorchAudio

Data manipulation and transformation for audio signal processing

The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the same philosophy of providing strong GPU acceleration, having a focus on trainable features through the autograd system, and having consistent style (tensor names and dimension names). Therefore, it is primarily a machine learning library and not a general signal processing library. The benefits of PyTorch can be seen in torchaudio through having all the computations be through PyTorch...

Downloads: 0 This Week

Last Update: 2026-02-17
21

Adversarial Robustness Toolbox

Adversarial Robustness Toolbox (ART) - Python Library for ML security

Adversarial Robustness Toolbox (ART) is a Python library for Machine Learning Security. ART provides tools that enable developers and researchers to evaluate, defend, certify, and verify Machine Learning models and applications against the adversarial threats of Evasion, Poisoning, Extraction, and Inference. ART supports all popular machine learning frameworks (TensorFlow, Keras, PyTorch, MXNet, scikit-learn, XGBoost, LightGBM, CatBoost, GPy, etc.), all data types (images, tables, audio,...

Downloads: 0 This Week

Last Update: 2025-07-07
22

Adapters

A Unified Library for Parameter-Efficient Learning

Adapters is an add-on library to HuggingFace's Transformers, integrating 10+ adapter methods into 20+ state-of-the-art Transformer models with minimal coding overhead for training and inference. Adapters provide a unified interface for efficient fine-tuning and modular transfer learning, supporting a myriad of features like full-precision or quantized training (e.g. Q-LoRA, Q-Bottleneck Adapters, or Q-PrefixTuning), adapter merging via task arithmetics or the composition of multiple adapters...

Downloads: 0 This Week

Last Update: 2025-05-20
23

Llama Recipes

Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods

The 'llama-recipes' repository is a companion to the Meta Llama models. We support the latest version, Llama 3.1, in this repository. The goal is to provide a scalable library for fine-tuning Meta Llama models, along with some example scripts and notebooks to quickly get started with using the models in a variety of use-cases, including fine-tuning for domain adaptation and building LLM-based applications with Llama and other tools in the LLM ecosystem. The examples here showcase how to run...

Downloads: 0 This Week

Last Update: 2025-01-22
24

DoWhy

DoWhy is a Python library for causal inference

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks. Much like machine learning libraries have done for prediction, DoWhy is a Python library that aims to spark causal thinking and analysis. DoWhy provides a wide variety of algorithms for effect estimation, causal structure learning, diagnosis of causal...

Downloads: 0 This Week

Last Update: 2025-11-03
25

EconML

Python Package for ML-Based Heterogeneous Treatment Effects Estimation

EconML is a Python package for estimating heterogeneous treatment effects from observational data via machine learning. This package was designed and built as part of the ALICE project at Microsoft Research with the goal of combining state-of-the-art machine learning techniques with econometrics to bring automation to complex causal inference problems. One of the biggest promises of machine learning is to automate decision-making in a multitude of domains. At the core of many data-driven...

Downloads: 0 This Week

Last Update: 2025-07-10