Search Results for "nvidia gpu mod" - Page 2

Showing 155 open source projects for "nvidia gpu mod"

1

CUDA.jl

CUDA programming in Julia

High-performance GPU programming in a high-level language. JuliaGPU is a GitHub organization created to unify the many packages for programming GPUs in Julia. With its high-level syntax and flexible compiler, Julia is well-positioned to productively program hardware accelerators like GPUs without sacrificing performance. The latest development version of CUDA.jl requires Julia 1.8 or higher. If you are using an older version of Julia, you need to use a previous version of CUDA.jl. This will...

Downloads: 6 This Week

Last Update: 2026-04-22
See Project
2

Transformer Engine

A library for accelerating Transformer models on NVIDIA GPUs

Transformer Engine (TE) is a library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference. TE provides a collection of highly optimized building blocks for popular Transformer architectures and an automatic mixed precision-like API that can be used seamlessly with your framework-specific code. TE also includes a framework-agnostic C++...

Downloads: 4 This Week

Last Update: 5 days ago
See Project
3

waifu2x ncnn Vulkan

waifu2x converter ncnn version, run fast GPU with vulkan

An ncnn implementation of the waifu2x converter. It runs fast on Intel, AMD, Nvidia, and Apple Silicon GPUs through the Vulkan API. waifu2x-ncnn-vulkan uses the ncnn project as its universal neural network inference framework.

Downloads: 51 This Week

Last Update: 2025-09-15
See Project
4

NVTOP

GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel

NVTOP stands for Neat Video card TOP, a (h)top-like task monitor for GPUs and accelerators. It can handle multiple GPUs and print information about them in an htop-familiar way. Currently supported vendors are AMD (Linux AMD GPU driver), Apple (limited M1 & M2 support), Huawei (Ascend), Intel (Linux i915 driver), NVIDIA (Linux proprietary drivers), and Qualcomm Adreno (Linux MSM driver).

Downloads: 4 This Week

Last Update: 2026-02-08
See Project
5

clone-voice

A sound cloning tool with a web interface, using your voice

...The app is designed to be very easy to use: you download a precompiled package, double-click app.exe, and it launches a browser-based web interface where you control cloning and synthesis. It does not require an NVIDIA GPU to run basic tasks, although GPU acceleration can be used when available, making it accessible on modest machines. The tool supports around sixteen languages, including Chinese, English, Japanese, Korean, French, German, Italian, and others, and can capture reference voices directly from a microphone or from uploaded audio.

Downloads: 12 This Week

Last Update: 2025-11-28
See Project
6

XMRig

RandomX, KawPow, CryptoNight, AstroBWT and GhostRider unified miner

XMRig is a high-performance, open-source, cross-platform RandomX, KawPow, CryptoNight, and AstroBWT unified CPU/GPU miner, RandomX benchmark, and stratum proxy. Official binaries are available for Windows, Linux, macOS, and FreeBSD. The preferred way to configure the miner is the JSON config file, as it is more flexible and human-friendly. The command-line interface...

1 Review

Downloads: 25 This Week

Last Update: 2026-03-28
See Project
7

FastKoko

Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model

FastKoko is a self-hosted text-to-speech server built around the Kokoro-82M model and exposed through a FastAPI backend. It is designed to be easy to deploy via Docker, with separate CPU and GPU images so that users can choose between pure CPU inference and NVIDIA GPU acceleration. The project exposes an OpenAI-compatible speech endpoint, which means existing code that talks to the OpenAI audio API can often be pointed at a Kokoro-FastAPI instance with minimal changes. It supports multiple languages and voicepacks and allows phoneme based generation for more accurate pronunciation and prosody. ...

Downloads: 4 This Week

Last Update: 2025-12-13
See Project
8

Niri

A scrollable-tiling Wayland compositor

Niri is a dynamic, scrollable-tiling Wayland compositor for Linux, organized around columns laid out infinitely to the right. It supports multi-monitor setups, fractional scaling, floating windows, NVIDIA drivers, and input devices like tablets and touchpads. It is stable for daily usage, and many users have adopted it as their primary Wayland environment.

Downloads: 2 This Week

Last Update: 5 days ago
See Project
9

CuPy

A NumPy-compatible array library accelerated by CUDA

CuPy is an open source implementation of NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class and many functions on it. CuPy offers GPU accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can even speed up some operations by more than 100X. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases. ...

Downloads: 4 This Week

Last Update: 2026-02-20
See Project
10

CV-CUDA

CV-CUDA™ is an open-source, GPU accelerated library

CV-CUDA is an open-source project that enables building efficient cloud-scale Artificial Intelligence (AI) imaging and computer vision (CV) applications. It uses graphics processing unit (GPU) acceleration to help developers build highly efficient pre- and post-processing pipelines. CV-CUDA originated as a collaborative effort between NVIDIA and ByteDance.

Downloads: 0 This Week

Last Update: 2025-11-15
See Project
11

Humanoid-Gym

Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real

Humanoid-Gym is a reinforcement learning framework designed to train locomotion and control policies for humanoid robots using high-performance simulation environments. The system is built on top of NVIDIA Isaac Gym, which allows large-scale parallel simulation of robotic environments directly on GPU hardware. Its primary goal is to enable efficient training of humanoid robots in simulation while enabling policies to transfer effectively to real-world hardware without additional training. The framework emphasizes the concept of zero-shot sim-to-real transfer, meaning that behaviors learned in simulation can be deployed directly on physical robots with minimal adjustment. ...

Downloads: 0 This Week

Last Update: 2026-03-15
See Project
12

Fabulously Optimized

A simple Minecraft modpack focusing on performance and graphics

A simple Minecraft modpack focused on performance and graphics enhancements, providing a smooth experience with multiple optimization mods.

Downloads: 31 This Week

Last Update: 5 days ago
See Project
13

FlashAttention

Fast and memory-efficient exact attention

...It achieves this by using IO-aware algorithms that minimize memory reads and writes, reducing the quadratic memory overhead typically associated with attention operations. The project provides implementations of FlashAttention, FlashAttention-2, and newer iterations optimized for modern GPU architectures such as NVIDIA Hopper and AMD accelerators. By improving both forward and backward pass efficiency, it enables training and inference of large language models with longer sequence lengths and higher throughput. The library integrates with PyTorch and supports various attention configurations, including causal masking, multi-query attention, and rotary embeddings.

Downloads: 40 This Week

Last Update: 2026-03-18
See Project
14

Isaac ROS Visual SLAM

Visual SLAM/odometry package based on NVIDIA-accelerated cuVSLAM

Discover a faster, easier way to build advanced AI robotics applications with the NVIDIA Isaac™ ROS collection of accelerated computing packages and AI models, bringing NVIDIA acceleration to ROS developers everywhere. Isaac ROS Visual SLAM provides a high-performance, best-in-class ROS 2 package for VSLAM (visual simultaneous localization and mapping). This package uses one or more stereo cameras and optionally an IMU to estimate odometry as an input to navigation.

Downloads: 2 This Week

Last Update: 2026-03-24
See Project
15

CUDA Containers for Edge AI & Robotics

Machine Learning Containers for NVIDIA Jetson and JetPack-L4T

...The project is particularly useful for developers building edge AI and robotics systems that rely on GPU-accelerated inference and real-time computer vision. By using containerized environments, developers can ensure that their applications run consistently across different Jetson platforms and JetPack versions. The repository also includes build tools and package management utilities that help automate the process of assembling machine learning environments.

Downloads: 0 This Week

Last Update: 2026-04-23
See Project
16

CUDA Python

Performance meets Productivity

CUDA Python is a unified Python interface for accessing and working with the NVIDIA CUDA platform, enabling developers to build GPU-accelerated applications entirely in Python. It acts as a metapackage composed of multiple submodules that provide both high-level and low-level access to CUDA functionality, including runtime APIs, driver APIs, and JIT compilation tools. The project is designed to simplify GPU programming by offering Pythonic abstractions while still exposing the full power of CUDA for advanced users. ...

Downloads: 0 This Week

Last Update: 2 days ago
See Project
17

OptiScaler

OptiScaler bridges upscaling/frame gen across GPUs

...Instead of relying on the upscaling method originally integrated by a game developer, the software intercepts the game’s rendering pipeline and redirects it to alternative technologies chosen by the user. This makes it possible to swap technologies such as NVIDIA DLSS, AMD FSR, or Intel XeSS even if the game only supports one of them by default. The tool effectively acts as a compatibility layer between the game engine and multiple upscaling frameworks, enabling cross-GPU access to features that might otherwise be restricted to specific hardware ecosystems. In addition to replacing upscalers, OptiScaler can enable frame generation features in titles that do not officially support them, improving frame rates and perceived smoothness during gameplay.

Downloads: 155 This Week

Last Update: 2 days ago
See Project
18

Megatron-LM

Ongoing research training transformer models at scale

Megatron-LM is a GPU-optimized deep learning framework from NVIDIA designed to train extremely large transformer-based language models efficiently at scale. The repository provides both a reference training implementation and Megatron Core, a composable library of high-performance building blocks for custom large-model pipelines. It supports advanced parallelism strategies including tensor, pipeline, data, expert, and context parallelism, enabling training across massive multi-GPU and multi-node clusters. ...

Downloads: 1 This Week

Last Update: 2026-04-22
See Project
19

ParallelStencil.jl

Package for writing high-level code for parallel stencil computations

ParallelStencil empowers domain scientists to write architecture-agnostic high-level code for parallel high-performance stencil computations on GPUs and CPUs. Performance similar to CUDA C / HIP can be achieved, which is typically a large improvement over the performance reached when using only CUDA.jl or AMDGPU.jl GPU Array programming. For example, a 2-D shallow ice solver presented at JuliaCon 2020 [1] achieved nearly 20 times better performance than a corresponding GPU Array programming implementation; in absolute terms, it reached 70% of the theoretical upper performance bound of the Nvidia P100 GPU used, as defined by the effective throughput metric, T_eff. ...

Downloads: 0 This Week

Last Update: 2026-03-02
See Project
20

DALI

A GPU-accelerated library containing highly optimized building blocks

The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provides a collection of highly optimized building blocks for loading and processing image, video and audio data. It can be used as a portable drop-in replacement for built-in data loaders and data iterators in popular deep learning frameworks.

Downloads: 3 This Week

Last Update: 2026-04-16
See Project
21

TensorRT

C++ library for high performance inference on NVIDIA GPUs

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40X faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers,...

Downloads: 8 This Week

Last Update: 2026-03-25
See Project
22

FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

FlashMLA is a high-performance decoding kernel library designed especially for Multi-Head Latent Attention (MLA) workloads, targeting NVIDIA Hopper GPU architectures. It provides optimized kernels for MLA decoding, including support for variable-length sequences, helping reduce latency and increase throughput in model inference systems using that attention style. The library supports both BF16 and FP16 data types, and includes a paged KV cache implementation with a block size of 64 to efficiently manage memory during decoding. ...

Downloads: 0 This Week

Last Update: 2026-03-31
See Project
23

cuDF

GPU DataFrame Library

...The RAPIDS suite of open-source software libraries aims to enable the execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization while exposing that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

Downloads: 0 This Week

Last Update: 2026-04-08
See Project
24

nvgpu-smi-snmp

SNMP agent for NVIDIA GPU data retrieved from the binary nvidia-smi

This is a fork of the project https://github.com/marwan-abdellah/nvgpu-snmp which, instead of using the NV-CONTROL X extension to retrieve data, calls the NVIDIA binary nvidia-smi. The biggest advantage of this approach is that the SNMP daemon does not need access to the X display.

3 Reviews

Downloads: 1 This Week

Last Update: 2025-07-21
See Project
25

exo

Run your own AI cluster at home with everyday devices

Run your own AI cluster at home with everyday devices. Maintained by exo labs. Forget expensive NVIDIA GPUs: unify your existing devices (iPhone, iPad, Android, Mac, Linux, or pretty much any device) into one powerful GPU. Run 8B, 70B, and 405B parameter models on your own devices.

Downloads: 6 This Week

Last Update: 7 days ago
See Project
