gpu free download - SourceForge

64 projects for "gpu" with 2 filters applied:

Filters: Software Development, BSD

1

Khronos KTX

KTX (Khronos Texture) Library and Tools

KTX-Software is a suite of tools and libraries for working with Khronos Texture (KTX) files, designed and maintained by the Khronos Group. KTX is a container format for storing textures that are optimized for GPU upload, supporting modern formats like Basis Universal and ASTC. This repository includes tools for creating, validating, inspecting, and converting KTX and KTX2 files, making it essential for developers working in 3D engines, games, and visualization tools where texture streaming and compression are key.

Downloads: 35 This Week

Last Update: 2025-10-04
See Project
2

waifu2x ncnn Vulkan

waifu2x converter ncnn version, runs fast on GPU with Vulkan

ncnn implementation of waifu2x converter. Runs fast on Intel/AMD/Nvidia/Apple-Silicon with Vulkan API. waifu2x-ncnn-vulkan uses ncnn project as the universal neural network inference framework.

Downloads: 7 This Week

Last Update: 2025-09-15
See Project
3

libplacebo

Official mirror of libplacebo

libplacebo is a flexible, high-performance graphics library built on top of Vulkan, designed to provide reusable GPU-accelerated components for media applications. It originated as a core part of the rendering pipeline for the mpv media player and has since grown into a standalone library used for tone mapping, dithering, color space conversion, and more. libplacebo is ideal for developers looking to integrate sophisticated video rendering and post-processing into their own applications with full control over shaders and rendering stages.

Downloads: 8 This Week

Last Update: 2025-05-21
See Project
4

PortableGL

An implementation of OpenGL 3.x-ish in clean C

PortableGL is a single-header, software-only implementation of a subset of OpenGL (roughly the OpenGL 3.x core pipeline), designed to run entirely on the CPU. This lightweight graphics library allows OpenGL-style rendering without GPU acceleration, making it suitable for educational use, debugging, embedded systems, and retro-style software rendering. Because it mirrors OpenGL syntax and design, it can act as a CPU renderer for testing or deploying 3D graphics on platforms without GPU support.

Downloads: 4 This Week

Last Update: 2025-09-15
See Project
5

DeepEP

DeepEP: an efficient expert-parallel communication library

DeepEP is a communication library designed specifically to support Mixture-of-Experts (MoE) and expert parallelism (EP) deployments. Its core role is to implement high-throughput, low-latency all-to-all GPU communication kernels, which handle the dispatching of tokens to different experts (or shards) and then combining expert outputs back into the main data flow. Because MoE architectures require routing inputs to different experts, communication overhead can become a bottleneck — DeepEP addresses that by providing optimized GPU kernels and efficient dispatch/combining logic. ...

Downloads: 0 This Week

Last Update: 2025-10-03
See Project
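
The dispatch/combine pattern described for DeepEP can be illustrated with a toy, CPU-only sketch. This is not DeepEP's API: the function names `dispatch`, `combine`, and `run_moe` are hypothetical, and plain Python lists stand in for the GPU all-to-all kernels. It only shows the data flow, in which tokens are grouped by destination expert, processed, and written back to their original positions.

```python
from collections import defaultdict


def dispatch(tokens, expert_ids):
    """Group tokens by destination expert, remembering each original position."""
    buckets = defaultdict(list)
    for pos, (tok, eid) in enumerate(zip(tokens, expert_ids)):
        buckets[eid].append((pos, tok))
    return dict(buckets)


def combine(expert_outputs, num_tokens):
    """Scatter expert outputs back into the original token order."""
    out = [None] * num_tokens
    for results in expert_outputs.values():
        for pos, value in results:
            out[pos] = value
    return out


def run_moe(tokens, expert_ids, experts):
    """Dispatch tokens, apply each expert to its bucket, then combine."""
    buckets = dispatch(tokens, expert_ids)
    expert_outputs = {
        eid: [(pos, experts[eid](tok)) for pos, tok in items]
        for eid, items in buckets.items()
    }
    return combine(expert_outputs, len(tokens))


# Two stand-in "experts" (hypothetical): one doubles, one adds 100.
experts = {0: lambda x: x * 2, 1: lambda x: x + 100}
result = run_moe([1, 2, 3, 4], [0, 1, 1, 0], experts)
print(result)  # [2, 102, 103, 8]
```

In a real expert-parallel deployment the buckets would live on different devices, so the two regrouping steps are where the all-to-all communication cost arises.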
6

TorchQuantum

A PyTorch-based framework for Quantum Classical Simulation

...Researchers on quantum algorithm design, parameterized quantum circuit training, quantum optimal control, quantum machine learning, and quantum neural networks. Dynamic computation graph, automatic gradient computation, fast GPU support, batch-mode tensorized processing.

Downloads: 1 This Week

Last Update: 2024-09-30
See Project
7

Images.jl

An image library for Julia

...Julia is well-suited to image processing because it is a modern and elegant high-level language that is a pleasure to use, while also allowing you to write "inner loops" that compile to efficient machine code (i.e., it is as fast as C). Julia supports multithreading and, through add-on packages, GPU processing. JuliaImages is a collection of packages specifically focused on image processing. It is not yet as complete as some toolkits for other programming languages, but it has many useful algorithms. It is focused on clean architecture and is designed to unify "machine vision" and "biomedical 3d image processing" communities.

Downloads: 1 This Week

Last Update: 2025-01-21
See Project
8

XFrames

GPU-accelerated GUI development for Node.js and the browser

xframes is a high-performance library that empowers developers to build native desktop applications using familiar web technologies, specifically Node.js and React, without the overhead of the DOM. xframes serves as a streamlined alternative to Electron, designed for developers looking to maximize performance and efficiency.

Downloads: 5 This Week

Last Update: 2024-12-07
See Project
9

Kintsugi

A tool to automatically resolve Git conflicts

...Named after the Japanese art of repair and beauty, Kintsugi embraces imperfect captures and enhances them intelligently, preserving natural detail while reducing noise and artifacts in ways that align with human visual preferences. The toolkit includes both CPU and GPU paths, allowing it to scale from mobile devices to powerful workstations while maintaining real-time or near-real-time responsiveness for interactive editing contexts. Its algorithmic suite is designed to be modular as well, so developers can pick and combine components for tasks like RAW image enhancement, HDR tone management, or aesthetic adjustments with perceptual fidelity.

Downloads: 0 This Week

Last Update: 2026-01-07
See Project
10

DualPipe

A bidirectional pipeline parallelism algorithm

DualPipe is a bidirectional pipeline parallelism algorithm open-sourced by DeepSeek, introduced in their DeepSeek-V3 technical framework. The main goal of DualPipe is to maximize overlap between computation and communication phases during distributed training, thus reducing idle GPU time (i.e. “pipeline bubbles”) and improving cluster efficiency. Traditional pipeline parallelism methods (e.g. 1F1B or staggered pipelining) leave gaps because forward and backward phases can’t fully overlap with communication. DualPipe addresses that by scheduling micro-batches from both ends of the pipeline in a bidirectional fashion—i.e. some micro-batches flow forward while others flow backward—so that computation on one partition can coincide with communication for another.

Downloads: 0 This Week

Last Update: 2025-12-25
See Project
11

webgl-plot

A high-performance real-time 2D plotting library based on native WebGL

...Unlike traditional canvas or SVG-based charting libraries, webgl-plot is optimized for streaming and dynamic updates, making it ideal for oscilloscope-style data, biomedical signals, or any application where data updates hundreds of times per second. Its minimal memory footprint and GPU acceleration ensure excellent performance even with tens of thousands of data points, and its simple API allows developers to get started quickly.

Downloads: 0 This Week

Last Update: 2025-03-26
See Project
12

FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

FlashMLA is a high-performance decoding kernel library designed especially for Multi-Head Latent Attention (MLA) workloads, targeting NVIDIA Hopper GPU architectures. It provides optimized kernels for MLA decoding, including support for variable-length sequences, helping reduce latency and increase throughput in model inference systems using that attention style. The library supports both BF16 and FP16 data types, and includes a paged KV cache implementation with a block size of 64 to efficiently manage memory during decoding. ...

Downloads: 1 This Week

Last Update: 2025-10-03
See Project
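
As a rough illustration of the paged KV cache idea mentioned for FlashMLA, the sketch below maps a logical token index to a physical block and an offset within it. Only the block size of 64 comes from the description; the class `PagedKVCache` and its methods are hypothetical names, and the code does bookkeeping only (no tensors, no GPU), so it should not be read as FlashMLA's interface.

```python
BLOCK_SIZE = 64  # block size stated in the FlashMLA description


class PagedKVCache:
    """Bookkeeping-only sketch: per-sequence block tables over a shared block pool."""

    def __init__(self, num_blocks):
        self.free_blocks = list(range(num_blocks))
        self.block_tables = {}  # seq_id -> list of physical block ids
        self.seq_lens = {}      # seq_id -> number of cached tokens

    def append_token(self, seq_id):
        """Reserve a slot for one new token; return (physical_block, offset)."""
        length = self.seq_lens.get(seq_id, 0)
        table = self.block_tables.setdefault(seq_id, [])
        if length % BLOCK_SIZE == 0:  # current block is full, or none exists yet
            if not self.free_blocks:
                raise MemoryError("no free KV blocks")
            table.append(self.free_blocks.pop(0))
        self.seq_lens[seq_id] = length + 1
        return table[length // BLOCK_SIZE], length % BLOCK_SIZE

    def locate(self, seq_id, token_idx):
        """Translate a logical token index into (physical_block, offset)."""
        if not 0 <= token_idx < self.seq_lens.get(seq_id, 0):
            raise IndexError("token index out of range")
        table = self.block_tables[seq_id]
        return table[token_idx // BLOCK_SIZE], token_idx % BLOCK_SIZE

    def release(self, seq_id):
        """Return all blocks of a finished sequence to the pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)


cache = PagedKVCache(num_blocks=4)
for _ in range(65):  # 65 tokens need two blocks of 64
    slot = cache.append_token("seq-a")
print(slot)  # (1, 0): first slot of the second block
```

Because sequences only claim blocks as they grow, variable-length sequences can share one pool without each reserving its maximum length up front.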
13

InvertibleNetworks.jl

A Julia framework for invertible neural networks

Building blocks for invertible neural networks in the Julia programming language.

Downloads: 1 This Week

Last Update: 2024-10-02
See Project
14

Theseus

A library for differentiable nonlinear optimization

...Because solves are differentiable, you can backpropagate through optimization to learn cost weights, feature extractors, or initialization networks end-to-end. The implementation supports batched optimization on GPU, robust losses, damping strategies, and custom factors, making it practical for real-time systems. Helper packages provide geometry primitives and utilities for composing priors, relative constraints, and measurement models. Theseus bridges the gap between classical optimization and deep learning, enabling hybrid systems that learn components.

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
15

Tracy Profiler

Frame profiler

...Tracy supports profiling the CPU (with direct support for C, C++, Lua, and Python integration, plus third-party bindings for many other languages such as Rust, Zig, C#, OCaml, and Odin), the GPU (all major graphics APIs: OpenGL, Vulkan, Direct3D 11/12, OpenCL), memory allocations, locks, and context switches. It can also automatically attribute screenshots to captured frames, and much more.

Downloads: 7 This Week

Last Update: 2025-12-11
See Project
16

GPUPixel

Real-time image and video processing library similar to GPUImage

GPUPixel is a real-time image and video processing library written in C++11, based on OpenGL/ES. It offers functionalities similar to GPUImage, including built-in beauty filters, enabling efficient processing and rendering of visual effects on images and videos.

Downloads: 5 This Week

Last Update: 2025-10-20
See Project
17

Floem

A native Rust UI library with fine-grained reactivity

Floem is a cross-platform GUI framework for Rust. It aims to be extremely performant while providing world-class developer ergonomics. Supporting both GPU and CPU rendering, Floem gives you performance that's closest to bare metal. Primitives are also provided to help developers write performant UI code without too much effort.

Downloads: 0 This Week

Last Update: 2024-11-15
See Project
18

FairChem

FAIR Chemistry's library of machine learning methods for chemistry

...Tasks span heterogeneous domains—catalysis (OC20-style), inorganic materials (OMat), molecules (OMol), MOFs (ODAC), and molecular crystals (OMC)—allowing one model family to serve many simulations. The README provides quick paths for pulling models (e.g., via Hugging Face access), then running energy/force predictions on GPU or CPU.

Downloads: 0 This Week

Last Update: 2025-12-11
See Project
19

PyOpenCL

OpenCL integration for Python, plus shiny features

PyOpenCL is a Python wrapper for the OpenCL framework, providing seamless access to parallel computing on CPUs, GPUs, and other accelerators. It enables developers to harness the full power of heterogeneous computing directly from Python, combining Python’s ease of use with the performance benefits of OpenCL. PyOpenCL also includes convenient features for managing memory, compiling kernels, and interfacing with NumPy, making it a preferred choice in scientific computing, data analysis, and...

Downloads: 4 This Week

Last Update: 2026-01-09
See Project
20

CGL

CGL (C Game Library) is a multipurpose library

...Designed for simplicity and portability, cgl allows rendering of primitives such as lines, circles, triangles, and text to an in-memory framebuffer, which can then be displayed with any platform-dependent backend. It’s ideal for building custom engines, retro-style games, GUIs, or educational demos where GPU acceleration is not required. Its small footprint and lack of external dependencies make it easy to embed in any C project.

Downloads: 4 This Week

Last Update: 2025-03-27
See Project
21

glsl-sandbox

Shader editor and gallery

...Because everything runs client-side, iteration is fast and portable—just load the page and start typing. It has become a staple tool in the creative-coding community, lowering the barrier to entry for shader art and GPU programming.

Downloads: 1 This Week

Last Update: 2025-10-24
See Project
22

Multimodal

TorchMultimodal is a PyTorch library

This project, also known as TorchMultimodal, is a PyTorch library for building, training, and experimenting with multimodal, multi-task models at scale. The library provides modular building blocks such as encoders, fusion modules, loss functions, and transformations that support combining modalities (vision, text, audio, etc.) in unified architectures. It includes a collection of ready model classes—like ALBEF, CLIP, BLIP-2, COCA, FLAVA, MDETR, and Omnivore—that serve as reference...

Downloads: 3 This Week

Last Update: 2026-01-12
See Project
23

ProbabilisticCircuits.jl

Probabilistic Circuits from the Juice library

This module provides a Julia implementation of Probabilistic Circuits (PCs), tools to learn structure and parameters of PCs from data, and tools to do tractable exact inference with them. Probabilistic Circuits provide a unifying framework for several families of tractable probabilistic models. PCs are represented as computational graphs that define a joint probability distribution as recursive mixtures (sum units) and factorizations (product units) of simpler distributions (input units)....

Downloads: 3 This Week

Last Update: 2024-06-10
See Project
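
The sum/product/input structure described for probabilistic circuits can be sketched in a few lines. The example below is written in Python rather than Julia and does not use the ProbabilisticCircuits.jl API; the classes `Input`, `Product`, and `Sum` and the example weights are assumptions made for illustration. It builds a mixture of two factorized distributions over binary variables X and Y, then computes a joint probability and a marginal in a single bottom-up pass.

```python
class Input:
    """Input unit: a categorical distribution over one variable."""

    def __init__(self, var, probs):
        self.var = var
        self.probs = probs  # value -> probability

    def value(self, assignment):
        # A variable missing from the assignment is marginalized out:
        # its distribution sums to 1, so the unit contributes 1.0.
        if self.var not in assignment:
            return 1.0
        return self.probs[assignment[self.var]]


class Product:
    """Product unit: factorization over children with disjoint variables."""

    def __init__(self, children):
        self.children = children

    def value(self, assignment):
        result = 1.0
        for child in self.children:
            result *= child.value(assignment)
        return result


class Sum:
    """Sum unit: weighted mixture of children (weights sum to 1)."""

    def __init__(self, weighted_children):
        self.weighted_children = weighted_children  # list of (weight, child)

    def value(self, assignment):
        return sum(w * c.value(assignment) for w, c in self.weighted_children)


comp_a = Product([Input("X", {0: 0.9, 1: 0.1}), Input("Y", {0: 0.8, 1: 0.2})])
comp_b = Product([Input("X", {0: 0.2, 1: 0.8}), Input("Y", {0: 0.3, 1: 0.7})])
circuit = Sum([(0.6, comp_a), (0.4, comp_b)])

joint = circuit.value({"X": 0, "Y": 0})  # 0.6*0.72 + 0.4*0.06
marginal = circuit.value({"X": 0})       # 0.6*0.9 + 0.4*0.2, with Y summed out
print(round(joint, 3), round(marginal, 3))  # 0.456 0.62
```

The marginal costs the same single pass as the joint, which is the sense in which inference in such circuits is tractable and exact.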
24

Tunix

A JAX-native LLM Post-Training Library

Tunix is a JAX-native library for post-training large language models, bringing supervised fine-tuning, reinforcement learning–based alignment, and knowledge distillation into one coherent toolkit. It embraces JAX’s strengths—functional programming, jit compilation, and effortless multi-device execution—so experiments scale from a single GPU to pods of TPUs with minimal code changes. The library is organized around modular pipelines for data loading, rollout, optimization, and evaluation, letting practitioners swap components without rewriting the whole stack. Examples and reference configs demonstrate end-to-end runs for common model families, helping teams reproduce baselines before customizing. ...

Downloads: 2 This Week

Last Update: 2025-11-21
See Project
25

frugally-deep

A lightweight header-only library for using Keras (TensorFlow) models

...Avoids temporarily allocating (potentially large chunks of) additional RAM during convolutions (by not materializing the im2col input matrix). Utterly ignores even the most powerful GPU in your system and uses only one CPU core per prediction. Quite fast on one CPU core, and you can run multiple predictions in parallel, thus utilizing as many CPUs as you like to improve the overall prediction throughput of your application/pipeline.

Downloads: 2 This Week

Last Update: 2025-05-16
See Project