cudnn free download - SourceForge

Torch-TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Torch-TensorRT is a compiler for PyTorch/TorchScript, targeting NVIDIA GPUs via NVIDIA’s TensorRT Deep Learning Optimizer and Runtime. Unlike PyTorch’s Just-In-Time (JIT) compiler, Torch-TensorRT is an Ahead-of-Time (AOT) compiler, meaning that before you deploy your TorchScript code, you go through an explicit compile step to convert a standard TorchScript program into a module targeting a TensorRT engine. Torch-TensorRT operates as a PyTorch extension and compiles modules that integrate...

Downloads: 12 This Week

Last Update: 2026-06-09


CUTLASS

CUDA Templates for Linear Algebra Subroutines

CUTLASS is a collection of CUDA C++ template abstractions for implementing high-performance matrix-multiplication (GEMM) and related computations at all levels and scales within CUDA. It incorporates strategies for hierarchical decomposition and data movement similar to those used to implement cuBLAS and cuDNN. CUTLASS decomposes these "moving parts" into reusable, modular software components abstracted by C++ template classes. These thread-wide, warp-wide, block-wide, and device-wide primitives can be specialized and tuned via custom tiling sizes, data types, and other algorithmic policies. The resulting flexibility simplifies their use as building blocks within custom kernels and applications. ...
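The idea of decomposing a GEMM into tiles can be sketched in a few lines of plain Python. This is only an illustration of tiling, not CUTLASS code and not its API: the function name `tiled_matmul` and the `tile` parameter are hypothetical, and a single square tile size stands in for the thread-, warp-, and block-level tiling that the description mentions.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply a (m x k) by b (k x n), visiting the output one tile at a time.

    Illustrative sketch only: `tile` is a hypothetical tuning knob standing in
    for the custom tiling sizes mentioned in the project description.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, tile):              # tile rows of the output
        for j0 in range(0, n, tile):          # tile columns of the output
            for k0 in range(0, k, tile):      # step along the shared dimension
                # Accumulate the partial product of one tile pair.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for p in range(k0, min(k0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
result = tiled_matmul(a, b, tile=2)
```

Changing `tile` changes only the order in which partial products are accumulated, not the result, which is why tile size can be treated as a tuning parameter.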

Downloads: 2 This Week

Last Update: 1 day ago


SRU

Training RNNs as Fast as CNNs

...SRU is designed to provide expressive recurrence, enable highly parallelized implementation, and come with careful initialization to facilitate the training of deep models. We demonstrate the effectiveness of SRU on multiple NLP tasks. SRU achieves a 5-9x speed-up over cuDNN-optimized LSTM on classification and question answering datasets, and delivers stronger results than LSTM and convolutional models. We also obtain an average improvement of 0.7 BLEU over the Transformer model on translation by incorporating SRU into the architecture. The experimental code and SRU++ implementation are available on the dev branch, which will be merged into master later.

Downloads: 0 This Week

Last Update: 2022-08-09


textgenrnn

Easily train your own text-generating neural network

...Configure RNN size, the number of RNN layers, and whether to use bidirectional RNNs. Train on any generic input text file, including large files. Train models on a GPU and then use them to generate text with a CPU. Utilize a powerful cuDNN implementation of RNNs when training on the GPU, which greatly reduces training time compared with typical LSTM implementations. Train the model using contextual labels, allowing it to learn faster and produce better results in some cases.

Downloads: 0 This Week

Last Update: 2021-11-24


Deepo

Set up deep learning environment in a single command line

Deepo is a series of Docker images that lets you quickly set up your deep learning research environment. It supports almost all commonly used deep learning frameworks, supports GPU acceleration (CUDA and cuDNN included), also works in CPU-only mode, and runs on Linux (CPU version/GPU version), Windows (CPU version) and OS X (CPU version). It also includes a Dockerfile generator that lets you customize your own environment with Lego-like modules and automatically resolves the dependencies for you. Users in China who experience slow speeds when pulling the image from the public Docker registry can pull deepo images from the China registry mirror by specifying the full path, including the registry, in the docker pull command. ...

Downloads: 1 This Week

Last Update: 2021-09-08


Deep Learning with Keras and Tensorflow

Introduction to Deep Neural Networks with Keras and Tensorflow

...To date, TensorFlow comes in two different packages, tensorflow and tensorflow-gpu, depending on whether you want to install the framework with CPU-only or GPU support, respectively. NVIDIA drivers and cuDNN must be installed and configured beforehand. Please refer to the official TensorFlow documentation for further details. Since version 0.9, Theano has included libgpuarray in the stable release (it was previously available only in the development version). The goal of libgpuarray is (from the documentation) to make a common GPU ndarray (n-dimensional array) that can be reused by all projects and is as future-proof as possible, while keeping it easy to use for simple needs and quick tests. ...

Downloads: 0 This Week

Last Update: 2022-08-04


Search Results for "cudnn"

Showing 6 open source projects for "cudnn"


