Search Results for "gpu faster" - Page 3

Showing 79 open source projects for "gpu faster"

1

AI-powered enterprise search engine

AI-powered enterprise search engine

Gerev is an open-source, AI-powered enterprise search engine designed to help organizations quickly locate and retrieve information scattered across multiple internal tools, documents, and communication platforms. It enables users to search across sources such as Slack, Confluence, Jira, Google Drive, and other enterprise systems, consolidating fragmented knowledge into a single, unified search experience. By leveraging natural language processing, Gerev allows...

Downloads: 0 This Week

Last Update: 2026-03-18
2

ChatLLM Web

Chat with LLM like Vicuna totally in your browser with WebGPU

...Powered By web-llm. To use this app, you need a browser that supports WebGPU, such as Chrome 113 or Chrome Canary. Chrome versions ≤ 112 are not supported. You will need a GPU with about 6.4GB of memory. If your GPU has less memory, the app will still run, but the response time will be slower. The first time you use the app, you will need to download the model. For the Vicuna-7b model that we are currently using, the download size is about 4GB. After the initial download, the model will be loaded from the browser cache for faster usage.

Downloads: 0 This Week

Last Update: 2023-08-25
3

FFCV

Fast Forward Computer Vision (and other ML workloads!)

ffcv is a drop-in data loading system that dramatically increases data throughput in model training. From gridding to benchmarking to fast research iteration, there are many reasons to want faster model training. Below we present premade codebases for training on ImageNet and CIFAR, including both (a) extensible codebases and (b) numerous premade training configurations.

Downloads: 0 This Week

Last Update: 2024-08-07
4

Point-E

Point cloud diffusion for 3D model synthesis

point-e is the official repository for Point-E, a generative model developed by OpenAI that produces 3D point clouds from textual (or image) prompts. Its principal advantage is speed: it can generate 3D assets in just 1–2 minutes on a single GPU, which is significantly faster than many competing text-to-3D models. The model works via a two-stage diffusion approach: first, it uses a text → image diffusion network to produce a synthetic 2D view consistent with the prompt; then a second diffusion model converts that image into a 3D point cloud. While it does not match the fine detail of some slower methods, the tradeoff in speed makes it practical for prototyping and interactive 3D generation. ...

Downloads: 0 This Week

Last Update: 2025-10-02
5

G-Diffuser Bot

Discord bot and Interface for Stable Diffusion

The first release of the all-in-one installer version of G-Diffuser is here. This release no longer requires the installation of WSL or Docker and has a systray icon to keep track of and launch G-Diffuser components. The infinite zoom scripts have been updated with some improvements, notably a new compositer script that is hundreds of times faster than before. The release also features much easier "one-click" installation and updating, as well...

Downloads: 0 This Week

Last Update: 2023-03-23
6

Fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python

Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. We provide reference implementations of various sequence modeling papers. Recent work by Microsoft and Google has shown that data parallel training can be made significantly more efficient by sharding the model parameters and optimizer state across data parallel workers. These ideas are encapsulated in the...

Downloads: 1 This Week

Last Update: 2022-06-27
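
The sharding idea mentioned in the Fairseq description can be illustrated with a minimal sketch. This is not fairseq's API: the function names, the parameter names, and the round-robin assignment are all assumptions chosen for illustration. The point is only that each data-parallel worker keeps a slice of the parameters (and the matching optimizer state) instead of a full replica.

```python
# Illustrative only: a toy partitioning of parameters across workers.
# Names and sizes are hypothetical and do not come from fairseq.

def shard_round_robin(param_sizes, num_workers):
    """Assign every named parameter to exactly one worker, round-robin."""
    shards = [dict() for _ in range(num_workers)]
    for index, (name, size) in enumerate(sorted(param_sizes.items())):
        shards[index % num_workers][name] = size
    return shards


def per_worker_elements(shards):
    """Number of parameter elements each worker has to store."""
    return [sum(shard.values()) for shard in shards]


# Hypothetical model: parameter name -> number of elements.
param_sizes = {
    "embed": 1000,
    "attn.q": 400,
    "attn.k": 400,
    "attn.v": 400,
    "ffn": 800,
    "out": 1000,
}

shards = shard_round_robin(param_sizes, num_workers=2)
total_elements = sum(param_sizes.values())
stored_per_worker = per_worker_elements(shards)
```

Without sharding, every worker would hold all `total_elements` values plus their optimizer state. With sharding, the per-worker counts add up to that total instead. A real implementation would also balance shard sizes and gather parameters when they are needed for a forward pass, which this sketch leaves out.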
7

Tensorflow Transformers

State of the art faster Transformer with Tensorflow 2.0

...Images, for tasks like image classification, object detection, and segmentation. Audio, for tasks like speech recognition and audio classification. Faster autoregressive decoding, TFLite support, and simple creation of TFRecords. Auto-batching of tf.data.Dataset or tf.ragged tensors. Everything is a dictionary (inputs and outputs). Multiple mask modes such as causal, user-defined, and prefix. tensorflow-text tokenizer support. Supports GPU, TPU, and multi-GPU training with wandb, multiple callbacks, and automatic TensorBoard logging.

Downloads: 0 This Week

Last Update: 2023-03-23
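
To make the mask modes named in the description concrete, here is a minimal sketch of what causal and prefix attention masks usually look like. It is plain Python and does not use the Tensorflow Transformers library; the function names and the 1/0 convention (1 means "query may attend to key") are assumptions for illustration.

```python
# Illustrative only: attention masks as nested lists, rows = query positions,
# columns = key positions, 1 = attention allowed, 0 = attention blocked.

def causal_mask(seq_len):
    """Each position may attend to itself and to earlier positions only."""
    return [
        [1 if key <= query else 0 for key in range(seq_len)]
        for query in range(seq_len)
    ]


def prefix_mask(seq_len, prefix_len):
    """The first prefix_len positions are visible to every query;
    positions after the prefix are masked causally."""
    return [
        [1 if key < prefix_len or key <= query else 0 for key in range(seq_len)]
        for query in range(seq_len)
    ]


causal = causal_mask(3)
prefix = prefix_mask(4, prefix_len=2)
```

A user-defined mode would simply supply such a matrix directly rather than deriving it from a rule.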
8

Guild AI

Experiment tracking, ML developer tools

Guild AI is an open-source experiment tracking toolkit designed to bring systematic control to machine learning workflows, enabling users to build better models faster. It automatically captures every detail of training runs as unique experiments, facilitating comprehensive tracking and analysis. Users can compare and analyze runs to deepen their understanding and incrementally improve models. Guild AI simplifies hyperparameter tuning by applying state-of-the-art algorithms through...

Downloads: 0 This Week

Last Update: 2024-11-13
9

TensorFlowOnSpark

TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters

By combining salient features from the TensorFlow deep learning framework with Apache Spark and Apache Hadoop, TensorFlowOnSpark enables distributed deep learning on a cluster of GPU and CPU servers. It enables both distributed TensorFlow training and inferencing on Spark clusters, with a goal to minimize the amount of code changes required to run existing TensorFlow programs on a shared grid.

Downloads: 0 This Week

Last Update: 2024-08-05
10

Hugging Face Transformer

CPU/GPU inference server for Hugging Face transformer models

...Both are great tools but not very performant in inference. Then, if you spend some time, you can build something over ONNX Runtime and Triton inference server. You will usually get 2X to 4X faster inference compared to vanilla PyTorch. It's cool! However, if you want best-in-class performance on GPU, there is only a single possible combination: Nvidia TensorRT and Triton. You will usually get 5X faster inference compared to vanilla PyTorch.

Downloads: 1 This Week

Last Update: 2022-08-22
11

TSNE-CUDA

GPU Accelerated t-SNE for CUDA with Python bindings

This repo is an optimized CUDA version of the FIt-SNE algorithm with associated Python modules. We find that our implementation of t-SNE can be up to 1200x faster than Sklearn, or up to 50x faster than Multicore-TSNE when used with the right GPU. You can install binaries with Anaconda for CUDA versions 10.1 and 10.2 using conda install tsnecuda -c conda-forge. Tsnecuda supports CUDA versions 9.0 and later through source installation; check out the wiki for up-to-date installation instructions. ...

Downloads: 0 This Week

Last Update: 2022-07-14
12

MACE

Deep learning inference framework optimized for mobile platforms

...The runtime is optimized with NEON, OpenCL and Hexagon, and the Winograd algorithm is used to speed up convolution operations. Initialization is also optimized to be faster. Chip-dependent power options such as big.LITTLE scheduling and Adreno GPU hints are included as advanced APIs. Guaranteeing UI responsiveness is sometimes obligatory when running a model, so mechanisms such as automatically breaking an OpenCL kernel into small units are introduced to allow better preemption for the UI rendering task. Graph-level memory allocation optimization and buffer reuse are supported. ...

Downloads: 0 This Week

Last Update: 2022-01-13
13

Detectron2

Next-generation platform for object detection and segmentation

...Includes more features such as panoptic segmentation, DensePose, Cascade R-CNN, rotated bounding boxes, PointRend, DeepLab, etc. Can be used as a library to support different projects on top of it. We'll open source more research projects in this way. It trains much faster. Models can be exported to TorchScript format or Caffe2 format for deployment. With a new, more modular design, Detectron2 is flexible and extensible, and able to provide fast training on single or multiple GPU servers. Detectron2 includes high-quality implementations of state-of-the-art object detection algorithms.

Downloads: 0 This Week

Last Update: 2021-10-26
14

Tez

Tez is a super-simple and lightweight Trainer for PyTorch

Tez is a super-simple and lightweight Trainer for PyTorch. It also comes with many utils that you can use to tackle over 90% of deep learning projects in PyTorch. tez (तेज़ / تیز) means sharp, fast & active. This is a simple, to-the-point library to make your PyTorch training easy. This library is currently at an early stage, so there might be breaking changes. Currently, tez supports CPU, single-GPU, multi-GPU, and TPU training. More coming soon! Using tez is super-easy. We don't want you to...

Downloads: 0 This Week

Last Update: 2022-08-19
15

YOLO ROS

YOLO ROS: Real-Time Object Detection for ROS

...Darknet on the CPU is fast (approximately 1.5 seconds on an Intel Core i7-6700HQ CPU @ 2.60GHz × 8), but it is roughly 500 times faster on a GPU! You'll need an Nvidia GPU, and you'll have to install CUDA. The CMakeLists.txt file automatically detects whether you have CUDA installed. CUDA is a parallel computing platform and application programming interface (API) model created by Nvidia.

Downloads: 0 This Week

Last Update: 2022-08-12
16

HiFi-GAN

Generative Adversarial Networks for Efficient and High Fidelity Speech

...It introduces a generator architecture tailored to model the periodic structure of speech and a set of discriminators that focus on different scales and periods of the waveform to better capture naturalness. The model targets a sweet spot between sample quality and generation speed, outperforming many previous GAN vocoders while being far faster than typical autoregressive models. In experiments on LJSpeech, HiFi-GAN was shown to achieve mean opinion scores close to human recordings while synthesizing 22.05 kHz audio up to ~168× faster than real time on an NVIDIA V100 GPU. A smaller configuration trades a bit of quality for even higher speed and can run more than 13× faster than real time on CPU, making it suitable for deployment scenarios without powerful GPUs.

Downloads: 1 This Week

Last Update: 2025-11-28
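
As a quick worked example of what "168× faster than real time" means at 22.05 kHz, the sketch below computes the speedup as audio duration divided by synthesis time. The function name and the 10-second clip are assumptions for illustration; the synthesis time is derived from the quoted 168× figure rather than measured.

```python
# Illustrative arithmetic only; no HiFi-GAN code is involved.

def speedup_over_real_time(num_samples, sample_rate_hz, synthesis_seconds):
    """Seconds of audio produced per second of compute."""
    audio_seconds = num_samples / sample_rate_hz
    return audio_seconds / synthesis_seconds


SAMPLE_RATE_HZ = 22_050   # 22.05 kHz, as quoted in the description
QUOTED_SPEEDUP = 168      # GPU figure quoted in the description

# Hypothetical clip: 10 seconds of audio.
num_samples = 10 * SAMPLE_RATE_HZ
# Time the synthesis would take if the quoted speedup held exactly.
synthesis_seconds = 10 / QUOTED_SPEEDUP

speedup = speedup_over_real_time(num_samples, SAMPLE_RATE_HZ, synthesis_seconds)

# Equivalent throughput in audio samples generated per second of compute.
samples_per_second = SAMPLE_RATE_HZ * QUOTED_SPEEDUP
```

Under these assumptions the 10-second clip takes about 0.06 seconds to synthesize, which corresponds to roughly 3.7 million samples per second. The same function applied to the CPU figure of 13× gives about 0.77 seconds for the same clip.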
17

PyText

A natural language modeling framework based on PyTorch

...We use PyText at Facebook to iterate quickly on new modeling ideas and then seamlessly ship them at scale. Distributed-training support built on the new C10d backend in PyTorch 1.0. Mixed precision training support through APEX (trains faster with less GPU memory on NVIDIA Tensor Cores). Extensible components that allow easy creation of new models and tasks.

Downloads: 0 This Week

Last Update: 2021-08-31
18

Bangla TTS

Bangla text to speech synthesis in python

Bangla text to speech: a multilingual (Bangla, English) real-time ([almost] in a GPU) speech synthesis library.

Installation
* Install Anaconda
* conda create -n new_virtual_env python==3.6.8
* conda activate new_virtual_env
* pip install -r requirements.txt
* While running for the first time, keep your internet connection on to download the weights of the speech synthesis models (>500 MB)
* For...

Downloads: 1 This Week

Last Update: 2020-09-03
19

textgenrnn

Easily train your own text-generating neural network

...Train on and generate text at either the character-level or word-level. Configure RNN size, the number of RNN layers, and whether to use bidirectional RNNs. Train on any generic input text file, including large files. Train models on a GPU and then use them to generate text with a CPU. Utilize a powerful CuDNN implementation of RNNs when trained on the GPU, which massively speeds up training time as opposed to typical LSTM implementations. Train the model using contextual labels, allowing it to learn faster and produce better results in some cases.

Downloads: 0 This Week

Last Update: 2021-11-24
20

maskrcnn-benchmark

Fast, modular reference implementation of Instance Segmentation

Mask R-CNN Benchmark is a PyTorch-based framework that provides high-performance implementations of object detection, instance segmentation, and keypoint detection models. Originally built to benchmark Mask R-CNN and related models, it offers a clean, modular design to train and evaluate detection systems efficiently on standard datasets like COCO. The framework integrates critical components—region proposal networks (RPNs), RoIAlign layers, mask heads, and backbone architectures such as...

Downloads: 0 This Week

Last Update: 2025-10-06
21

Tensorpack

A Neural Net Training Interface on TensorFlow, with focus on speed

Tensorpack is a neural network training interface based on TensorFlow v1. It uses TensorFlow in an efficient way with no extra overhead. On common CNNs, it runs training 1.2~5x faster than the equivalent Keras code. Your training can probably get faster if written with Tensorpack. A scalable data-parallel multi-GPU / distributed training strategy is available off the shelf. Squeeze the best data loading performance out of Python with tensorpack.dataflow. Symbolic programming (e.g. tf.data) does not offer the data processing flexibility needed in research. ...

Downloads: 0 This Week

Last Update: 2022-08-01
22

LUMINOTH

Deep Learning toolkit for Computer Vision

LUMINOTH is an open-source deep learning toolkit designed for computer vision tasks, particularly object detection. The framework is implemented in Python and built on top of TensorFlow and the Sonnet neural network library, providing a modular environment for training and deploying detection models. It was created to simplify the process of building and experimenting with deep learning models capable of identifying objects within images. Luminoth includes support for popular object...

Downloads: 0 This Week

Last Update: 2026-03-15
23

RavenCoin Wallet

RavenCoin Wallet including CPU and GPU miners!

Raven is an experimental digital currency that enables instant payments to anyone, anywhere in the world. Raven uses peer-to-peer technology to operate with no central authority: managing transactions and issuing money are carried out collectively by the network. Raven Core is the name of open source software that enables the use of this currency. A digital peer-to-peer network for the facilitation of asset transfer. In the fictional world of Westeros, ravens are used as messengers who carry...

Downloads: 1 This Week

Last Update: 2022-01-24
24

SoAx

Structure of Arrays of multiple types

Structures of arrays (SoA) are generally faster than arrays of structures (AoS), while AoS are more convenient to use. This project (SoAx) combines the advantages of both. By means of C++11 template metaprogramming, SoAx achieves maximal performance (efficient use of the vector units and caches of modern CPUs) while providing a very convenient user interface (including object-oriented element handling) and flexibility. It has been designed to handle list-like sets of particles (similar to struct {int id;...

Downloads: 0 This Week

Last Update: 2017-10-19
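
The difference between the two layouts in the SoAx description can be sketched in a few lines. This is not SoAx code and does not reproduce its C++ template machinery or its vectorization benefits; it only shows how the same data is organized each way. The fields `x` and `mass` are hypothetical; only `id` appears in the description.

```python
# Illustrative only: the same four particles stored as AoS and as SoA.
from array import array

NUM_PARTICLES = 4

# Array of structures (AoS): one record per particle, fields interleaved.
particles_aos = [
    {"id": i, "x": float(i), "mass": 2.0} for i in range(NUM_PARTICLES)
]

# Structure of arrays (SoA): one contiguous typed array per field.
particles_soa = {
    "id": array("i", range(NUM_PARTICLES)),
    "x": array("d", (float(i) for i in range(NUM_PARTICLES))),
    "mass": array("d", [2.0] * NUM_PARTICLES),
}


def total_mass_aos(particles):
    """Visits every record even though only one field is needed."""
    return sum(particle["mass"] for particle in particles)


def total_mass_soa(particles):
    """Reads a single contiguous array and ignores the other fields."""
    return sum(particles["mass"])


mass_aos = total_mass_aos(particles_aos)
mass_soa = total_mass_soa(particles_soa)
```

In the AoS form, `particles_aos[2]` gives one whole particle, which is the convenient interface. In the SoA form, a per-field operation touches only the data it needs, which is what makes the layout friendly to caches and vector units in compiled code.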
25

CUDA-Quicksort

CUDA-Quicksort: A GPU-based implementation of the quicksort algorithm

...CUDA-quicksort is an iterative GPU-based implementation of the quicksort algorithm. "Experiments performed on six sorting benchmark distributions show that CUDA-quicksort is up to four times faster than GPU-quicksort and up to three times faster than CDP-quicksort."[*]. *Copyright © 2015 John Wiley & Sons, Ltd. Concurrency Computat.: Pract. Exper. (2015) DOI: 10.1002/cpe.3611 For further information, please see the corresponding publication: http://onlinelibrary.wiley.com/doi/10.1002/cpe.3611/abstract

Downloads: 0 This Week

Last Update: 2016-03-25
© 2026 Slashdot Media. All Rights Reserved.
