- automated use of local/shared memory with ulocal() annotation
- using .NET arrays in device programs efficiently with libgpuvm
- GPU programs spanning multiple assemblies, including dynamically loaded ones
- reduction in GPU loops with nured() annotation
- ... and other features
- non-GPU devices supported
- user-defined operators on device
- overloaded functions on device
- images supported
- teams of threads with global sync on device
- ... and other features
- global work size not divisible by local work size (gws % lws != 0) handled correctly
- tuples supported in device code
- continue inside nuwork() loops
- return and return <expr> in device code
- map() data performance primitive
- other changes and samples
- performance info and annotations added to some samples
- improved performance of many samples
- printing program binaries
- nucopy for copying between NUDA arrays
new features:
- pointer arithmetic and conversion in NUDA kernels
- OpenCL profiling of NUDA kernels and arrays
- varies() macro for generating variants of code
- peel() annotation for peeling nfor() loops
NUDA project alpha, version 0.0.1 is available.
Source code: https://sourceforge.net/projects/nuda/files/nuda.tar.gz/download
Documentation: https://sourceforge.net/projects/nuda/files/extran-guide.pdf/download
Features:
- programming GPUs with a subset of the Nemerle language (http://nemerle.org)
- support for OpenCL GPUs (includes both NVIDIA and AMD)
- annotations for sending loops to GPU
- GPU-side arrays
- support for multiple GPUs
- automatic command-line arguments (see documentation)
- a number of loop-transforming annotations (see documentation)
NUDA (= Nemerle Unified Device Architecture) is a set of extensions for the Nemerle programming language that facilitates GPU programming and writing HPC applications. Its main purpose is to experiment with extensible languages for HPC applications.
Several features are currently supported. You can write GPU code in a subset of the Nemerle language and send a loop to a GPU with a single annotation. The GPU code is translated into OpenCL, so both NVIDIA and AMD GPUs are supported. GPU-side arrays with garbage collection are supported, as are multiple GPUs. There are also a few other useful features, such as automatic command-line arguments. Try it and enjoy!