High performance image processing library in C++
The Simd Library is a free open source image processing library, designed for C and C++ programmers. It provides many useful high performance algorithms for image processing such as: pixel format conversion, image scaling and filtration, extraction of statistical information from images, motion detection, object detection (HAAR and LBP classifier cascades) and classification, and neural networks. The algorithms are optimized using various SIMD CPU extensions. In particular the library...
SIMD-accelerated libjpeg-compatible JPEG codec library
libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2, NEON, AltiVec) to accelerate baseline JPEG compression and decompression on x86, x86-64, ARM, and PowerPC systems. On such systems, libjpeg-turbo is generally 2-6x as fast as libjpeg, all else being equal. On other types of systems, libjpeg-turbo can still outperform libjpeg by a significant amount, by virtue of its highly-optimized Huffman coding routines. In many cases, the performance of libjpeg-turbo rivals...
Fast C++ library for linear algebra (matrix maths) and scientific computing. Easy to use functions and syntax, deliberately similar to Matlab. Uses template meta-programming techniques. Also provides efficient wrappers for LAPACK, BLAS, ATLAS, ARPACK and SuperLU libraries, including high-performance versions such as OpenBLAS and Intel MKL. Useful for machine learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc. For more details, see...
An open source software-defined GNSS receiver
An open source software-defined Global Navigation Satellite Systems (GNSS) receiver written in C++ and based on the GNU Radio framework.
SIMD macro assembler unified for ARM, MIPS, PPC and x86
UniSIMD assembler is a high-level C/C++ macro assembler framework unified across ARM, MIPS, Power and x86 architectures. It establishes a subset of both BASE and SIMD instruction sets with a clearly defined common API, so that application logic can be written and maintained in one place without code replication. The assembler itself isn't a separate tool, but rather a collection of C/C++ header files, which applications include directly in order to use it. At present, Intel SSE/SSE2/SSE4...
Realtime raytracer using SIMD on ARM, MIPS, PPC and x86
QuadRay engine is a realtime raytracing project aimed at full SIMD utilization on ARM, MIPS, Power and x86 architectures. The efficient use of SIMD is achieved by processing four rays at a time to match SIMD register width (hence the name). The rendering core of the engine is written in a unified SIMD assembler, allowing a single body of assembler code to be compatible with different processor architectures, thus reducing the need to maintain multiple parallel versions. At present, Intel SSE/SSE2/SSE4...
Vector Pascal is a language targeted at SIMD and multi-core instruction sets such as AVX, SSE2, and Xeon Phi. It has a SIMD compiler which supports parallel vector operations, loop unrolling, common subexpression elimination, etc. It is implemented in Java.
SLEEF stands for SIMD Library for Evaluating Elementary Functions. SLEEF implements vectorized versions of all C99 math functions, using the SIMD instructions of modern processors to make computation more efficient. The library also includes vectorized DFT subroutines.
Collection of fast and optimized assembly libraries for x86-64 Linux
LinAsm is a collection of very fast, SIMD-optimized libraries written in assembly for x86-64 Linux. It implements many common and widely used algorithms for array manipulation: searching, sorting, arithmetic and vector operations, unit conversions; fast mathematical and statistical functions; number and time conversion algorithms; finite impulse response (FIR) digital filters; spectrum analysis algorithms, the fast Hartley transform; CPU cache friendly functions and extremely fast abstract data...
PPface is a vector processor emulator/simulator
PPface is a vector processor emulator/simulator (SIMD array processor with 1-bit processing elements). The VEPRAN language is used to design parallel algorithms and operate on data slices. The system lets you visualize how an algorithm works by viewing vector memory and registers, and it supports debugging and animated program execution. To run this app on a 64-bit system, please install Windows Virtual PC and Windows XP Mode.
Lightweight intrusion detection for IoT and embedded devices.
The aim of the project is a lightweight intrusion detection library for embedded devices which supports MSP430 and ARM Cortex based devices. Features include DSP/SIMD support, IoT and embedded protocols, distributed operation, event and history management, tool-supported configuration and visualization. There is a Java port that supports fewer features.
Parallel pairwise correlation computation on Intel Xeon Phi clusters
The first parallel and distributed library for pairwise correlation/dependence computation on Intel Xeon Phi clusters. This library is written as C++ template classes and achieves high speed by exploiting the SIMD-instruction-level and thread-level parallelism within Xeon Phis as well as accelerator-level parallelism among multiple Xeon Phis. To facilitate balanced workload distribution, we have proposed a general framework for symmetric all-pairs computation by building provable bijective...
Converts other animations and video files to the ajpeg file format.
It converts other animations (pngs, gifs, videos and image sequences) to the ajpeg file format. It also includes a small viewer for the ajpeg files. It is optimized for SIMD instructions (MMX, SSE and so on) and to use multiple cores/CPUs simultaneously. Minimum system requirements: 1.5 GHz single core processor, 1 GB RAM. Recommended system requirements: 2 GHz dual core processor, 2 GB RAM. What's new in this version: most of the interface was redesigned; added several functions like...
Distributed and Parallel Computing with/for Python.
dispy is a comprehensive, yet easy to use framework for creating and using compute clusters to execute computations in parallel across multiple processors in a single machine (SMP), among many machines in a cluster, grid or cloud. dispy is well suited for data parallel (SIMD) paradigm where a computation (Python function or standalone program) is evaluated with different (large) datasets independently.
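The data-parallel pattern described here, one computation evaluated independently over several datasets, can be sketched with Python's standard library alone. This is not dispy's API: `run_data_parallel` and `word_count` are hypothetical names chosen for illustration, and a local thread pool stands in for the processors or cluster nodes that a real framework would schedule onto.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text):
    """The computation that is applied independently to each dataset."""
    return len(text.split())


def run_data_parallel(func, datasets, workers=4):
    """Apply func to every dataset independently.

    Assumption: a thread pool is used only as a stand-in for remote
    workers. Results are returned in the same order as the inputs.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, datasets))


# Three independent datasets, one result per dataset.
datasets = ["a b c", "d e", "f"]
results = run_data_parallel(word_count, datasets)
print(results)
```

Because no dataset depends on another, the same function could in principle be dispatched to separate processes or machines without changing the calling code.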
Speeding up the alignment of DNA reads or assembled contigs against a protein database has long been a challenge. The recent tool DIAMOND has significantly improved on the speed of BLASTX and RAPSearch, with a similar degree of sensitivity. Yet for applications like metagenomics, where a large amount of data is involved, DIAMOND still takes too much time. We introduce AC-DIAMOND, which attempts to speed up DIAMOND via better SIMD parallelization and reference indexing. Experimental results show...
Random number library
RandomLib is a C++ interface to the Mersenne Twister random number generator MT19937 and to the SIMD-oriented Fast Mersenne Twister random number generator, SFMT19937. For documentation, visit http://randomlib.sf.net
Embedded JPEG encoder
... are on the download page. For the latest release code, see the 'release-1-0' branch of the repository. The SSE2 implementation is in the 'simd.0' branch of the repository.
DD-AVX: Library of high-precision operations accelerated by AVX
DD-AVX: Library of high-precision operations accelerated by AVX. Author's page: http://www.slis.tsukuba.ac.jp/~s1530534/index.html E-mail: firstname.lastname@example.org Double-Double (DD) precision operations are used to reduce rounding errors and improve the convergence of Krylov subspace methods. This library provides Double-Double precision operations accelerated by AVX and AVX2. AVX and AVX2 are Intel SIMD instruction sets that perform four double-precision operations simultaneously. This library...
Simulator for delta calibration parameters
This is a program to simulate the errors in delta printers to help one understand how modifying any of the correction parameters will affect the height map. It can also display errors in the x and y directions, as well as the magnitude of xy and xyz errors. Custom color gradients are supported, including using alpha, making it easy to find test points for a least squares calibration routine. Images can be exported to any format Qt supports (PNG, BMP, JPG, PDF, etc). Simulation parameters...
Astronomical object/structure detection from 1D and 2D data sets.
Sombrero is a fast wavelet image processing and object detection C library for astronomical images. Sombrero is named after the "Mexican Hat" shape of the wavelet masks used in image convolution and is released under the GNU LGPL license.
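For readers unfamiliar with the "Mexican Hat" shape, the sketch below builds a small 2D mask from the textbook form of that wavelet, (1 - q) * exp(-q) with q = r^2 / (2 * sigma^2). It is an illustration only and not Sombrero's code: the function name `mexican_hat_mask`, its parameters, and the unnormalized scaling are all assumptions.

```python
import math


def mexican_hat_mask(size, sigma):
    """Return a size x size Mexican hat mask as a list of rows.

    Assumption: unnormalized textbook form, so the center value is 1.0
    and values turn negative once r^2 exceeds 2 * sigma^2.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the mask has a center pixel")
    half = size // 2
    mask = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            q = (x * x + y * y) / (2.0 * sigma * sigma)
            row.append((1.0 - q) * math.exp(-q))
        mask.append(row)
    return mask


mask = mexican_hat_mask(5, 1.0)
for row in mask:
    print(" ".join(f"{v:+.3f}" for v in row))
```

The positive peak surrounded by a negative ring is what gives the mask its hat-like profile when plotted.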
High-performance software for searching a large forensic DNA bank. It runs the Needleman-Wunsch algorithm to determine an individual's identity.
Smith-Waterman long DNA sequence alignment on Xeon Phi clusters
The first parallel Smith-Waterman algorithm exploiting Intel Xeon Phi clusters to accelerate the alignment of long DNA sequences. This algorithm is written in C++ (with a set of SIMD intrinsic extensions), OpenMP and MPI. The performance evaluation revealed that our algorithm achieves very stable performance, and yields a performance of up to 30.1 GCUPS on a single Xeon Phi and up to 111.4 GCUPS on four Xeon Phis sharing a host.
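As background for the GCUPS figures, the sketch below shows a plain, unvectorized Smith-Waterman score computation and how a cell-updates-per-second rate is derived from it. It is not the project's implementation: the function names, the linear gap penalty, and the scoring values are assumptions for illustration. One "cell update" is one entry of the dynamic programming matrix, so GCUPS is the number of cells divided by the run time in seconds and by 10^9.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of a and b (linear gap penalty).

    Assumption: scoring values are illustrative defaults. Only two rows
    of the matrix are kept, since the score needs no traceback.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def gcups(len_a, len_b, seconds):
    """Giga cell updates per second for one len_a x len_b alignment."""
    return len_a * len_b / seconds / 1e9


print(smith_waterman_score("GGACGTCC", "ACGT"))
print(gcups(1000, 1000, 0.001))
```

Every cell depends only on its upper, left, and upper-left neighbors, which is the property SIMD and multi-threaded implementations exploit to update many cells at once.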
Templated Fast Fourier Transform in C++
Provides templated FFT classes written in C++ that are easier to use than usual. This is still a work in progress, but both the complex and real FFT portions output correct results, and the code executes very quickly, only around 30% slower than the much more difficult to compile FFTW on SSE enabled platforms. Though optimized code only runs on SSE (x86) based platforms, it's a very simple matter to port the existing code to any other platform with different SIMD...
Mathematical library utilising SIMD features of common processors to accelerate many commonly-used algorithms where compilers fear to tread.
Darktable is a virtual lighttable and darkroom for photographers: it manages your digital negatives in a database and lets you view them through a zoomable light table. It also enables you to develop raw images and enhance them.