simd free download - SourceForge

Showing 21 open source projects for "simd"

1

SIMD

C++ wrappers for SIMD intrinsics

SIMD is a C++ library that provides portable abstractions over SIMD (Single Instruction, Multiple Data) instructions, enabling developers to write high-performance vectorized code without dealing directly with architecture-specific intrinsics. SIMD instructions allow a single operation to be applied to multiple data elements simultaneously, significantly accelerating numerical and data-parallel computations.

Downloads: 1 This Week

Last Update: 2026-04-29
See Project
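
To make the idea of a portable wrapper over intrinsics concrete, here is a minimal sketch. It is not this project's API: the `Float4` type, the `SKETCH_*` macros and the `add_arrays` function are hypothetical names chosen for illustration. The sketch uses SSE intrinsics on x86, NEON intrinsics on ARM, and a plain scalar fallback elsewhere, so calling code never touches the architecture-specific calls directly.

```cpp
#include <cassert>
#include <cstddef>

// Pick a backend at compile time (macro names are illustrative only).
#if defined(__SSE2__) || defined(_M_X64)
  #include <emmintrin.h>
  #define SKETCH_SSE 1
#elif defined(__ARM_NEON)
  #include <arm_neon.h>
  #define SKETCH_NEON 1
#endif

// Hypothetical wrapper: four packed floats with element-wise addition.
struct Float4 {
#if defined(SKETCH_SSE)
    __m128 v;
    static Float4 load(const float* p) { return {_mm_loadu_ps(p)}; }
    void store(float* p) const { _mm_storeu_ps(p, v); }
    friend Float4 operator+(Float4 a, Float4 b) { return {_mm_add_ps(a.v, b.v)}; }
#elif defined(SKETCH_NEON)
    float32x4_t v;
    static Float4 load(const float* p) { return {vld1q_f32(p)}; }
    void store(float* p) const { vst1q_f32(p, v); }
    friend Float4 operator+(Float4 a, Float4 b) { return {vaddq_f32(a.v, b.v)}; }
#else
    float v[4];  // scalar fallback with the same interface
    static Float4 load(const float* p) { return {{p[0], p[1], p[2], p[3]}}; }
    void store(float* p) const { for (int i = 0; i < 4; ++i) p[i] = v[i]; }
    friend Float4 operator+(Float4 a, Float4 b) {
        Float4 r;
        for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
        return r;
    }
#endif
};

// out[i] = a[i] + b[i]: four elements per step, then a scalar tail.
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
        (Float4::load(a + i) + Float4::load(b + i)).store(out + i);
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

The calling loop is identical on every platform; only the body of `Float4` changes. Real wrapper libraries cover many more operations and vector widths than this sketch does.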
2

HLSL++

Math library using HLSL syntax with multiplatform SIMD support

...It provides vector, matrix, and math operations with a syntax identical or very similar to HLSL, allowing seamless transition between shader code and application code. The library is optimized for performance and supports SIMD instructions across multiple architectures, including SSE, AVX, AVX2, AVX512, and ARM NEON, ensuring high efficiency on modern hardware. It also extends beyond standard HLSL capabilities by introducing additional features such as quaternion support, advanced matrix operations, and extended vector types like float8. The library is particularly valuable for game developers who need consistency between CPU and GPU computations, reducing errors and improving maintainability.

Downloads: 1 This Week

Last Update: 2026-05-05
See Project
3

Google Highway

Performance-portable, length-agnostic SIMD with runtime dispatch

Google Highway is a high-performance C++ library designed to provide portable SIMD (Single Instruction, Multiple Data) vectorization across multiple CPU architectures while maintaining predictable and efficient behavior. It abstracts low-level vector intrinsics into a consistent API that maps closely to hardware instructions, allowing developers to write high-performance code without relying heavily on compiler auto-vectorization.

Downloads: 0 This Week

Last Update: 2026-04-23
See Project
4

ispc

Intel SPMD Program Compiler

...Under the SPMD model, the programmer writes a program that generally appears to be a regular serial program, though the execution model is actually that a number of program instances execute in parallel on the hardware. ispc compiles a C-based SPMD programming language to run on the SIMD units of CPUs and GPUs; it frequently provides a 3x or greater speedup on architectures with 4-wide SSE vector units and 5x-6x on architectures with 8-wide AVX vector units, without any of the difficulty of writing intrinsics code. ispc also supports parallelization across multiple cores, making it possible to write programs whose performance scales with both the number of cores and the vector unit width. ...

Downloads: 0 This Week

Last Update: 2026-02-04
See Project
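
The SPMD execution model described in this entry can be illustrated with a scalar C++ emulation. This is only a sketch of the idea and not ispc code or ispc output: `kGangWidth`, `instance_body` and `run_spmd` are hypothetical names, and the inner loop stands in for what a real SPMD compiler would map onto the lanes of one SIMD register.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed gang size for illustration, e.g. eight floats in one 8-wide register.
constexpr std::size_t kGangWidth = 8;

// The program as one instance sees it: ordinary serial-looking code,
// including a data-dependent branch.
inline void instance_body(std::size_t index, const float* in, float* out) {
    const float x = in[index];
    out[index] = (x < 0.0f) ? -x : x * x;
}

// Emulated execution: instances run in gangs of kGangWidth. A real SPMD
// compiler would execute one gang in lockstep on SIMD lanes; here the lanes
// are a plain loop, and lanes past the end of the data are masked off.
void run_spmd(const std::vector<float>& in, std::vector<float>& out) {
    out.resize(in.size());
    for (std::size_t base = 0; base < in.size(); base += kGangWidth) {
        for (std::size_t lane = 0; lane < kGangWidth; ++lane) {
            const std::size_t i = base + lane;
            if (i < in.size())  // inactive lane when the last gang is partial
                instance_body(i, in.data(), out.data());
        }
    }
    assert(out.size() == in.size());
}
```

The point of the model is that `instance_body` is written once, as if for a single element, while the mapping to vector lanes and the masking of partial gangs are left to the compiler.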
5

XNNPACK

High-efficiency floating-point neural network inference operators

...Rather than serving as a standalone ML framework, XNNPACK provides high-performance computational primitives (such as convolutions, pooling, activation functions, and arithmetic operations) that are integrated into higher-level frameworks like TensorFlow Lite, PyTorch Mobile, ONNX Runtime, TensorFlow.js, and MediaPipe. The library is written in C/C++ and designed for maximum portability, efficiency, and performance, leveraging platform-specific instruction sets (e.g., NEON, AVX, WebAssembly SIMD) for optimized execution. It supports NHWC tensor layouts and allows flexible striding along the channel dimension to efficiently handle channel-split and concatenation operations without additional cost.

Downloads: 2 This Week

Last Update: 7 days ago
See Project
6

UniSIMD-assembler

SIMD macro assembler unified for ARM, MIPS, PPC and x86

UniSIMD assembler is a high-level C/C++ macro assembler framework unified across ARM, MIPS, POWER and x86 architectures. It establishes a subset of both BASE and SIMD instruction sets with a clearly defined common API, so that application logic can be written and maintained in one place without code replication. The assembler itself isn't a separate tool, but rather a collection of C/C++ header files, which applications need to include directly in order to use. At present, Intel SSE/SSE2/SSE4 and AVX/AVX2/AVX-512 (32/64-bit x86 ISAs), ARMv7 NEON/NEONv2, ARMv8 AArch32 and AArch64 NEON, SVE (32/64-bit ARM ISAs), MIPS 32/64-bit r5/r6 MSA and POWER 32/64-bit VMX/VSX (little/big-endian ISAs) are mostly implemented (with horizontal reductions), although scalar improvements and wider SIMD vectors with zeroing/merging predicates in 3/4-operand instructions are planned as extensions to the current 2/3-operand SPMD-driven vertical SIMD ISA. ...

Downloads: 0 This Week

Last Update: 2024-11-20
See Project
7

HighwayHash

Fast strong hash functions: SipHash/HighwayHash

HighwayHash is a fast, keyed hash function intended for scenarios where you need strong, DoS-resistant hashing without the full overhead of a general-purpose cryptographic hash. It’s designed to defeat hash-flooding attacks by mixing input with wide SIMD operations and a branch-free inner loop, so adversaries can’t cheaply craft many colliding keys. The implementation targets multiple CPU families with vectorized code paths while keeping a portable fallback, yielding high throughput across platforms. It exposes simple one-shot and streaming APIs, so you can hash short keys or long byte streams with the same function. ...

Downloads: 0 This Week

Last Update: 2025-10-10
See Project
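
The keyed-hash idea behind this entry can be sketched with SipHash-2-4, the other function named in the project's tagline. The code below follows the published SipHash-2-4 algorithm in plain scalar C++; it is not HighwayHash itself and not this project's API, and the names `siphash24`, `rotl64` and `load_le64` are chosen here for illustration. The 128-bit secret key (`k0`, `k1`) is what prevents an attacker from precomputing colliding inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

static inline std::uint64_t rotl64(std::uint64_t x, int b) {
    return (x << b) | (x >> (64 - b));
}

// Read 8 bytes as a little-endian 64-bit word, independent of host endianness.
static inline std::uint64_t load_le64(const unsigned char* p) {
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v |= static_cast<std::uint64_t>(p[i]) << (8 * i);
    return v;
}

// SipHash-2-4: 2 compression rounds per 8-byte block, 4 finalization rounds.
std::uint64_t siphash24(const unsigned char* data, std::size_t len,
                        std::uint64_t k0, std::uint64_t k1) {
    assert(data != nullptr || len == 0);
    std::uint64_t v0 = k0 ^ 0x736f6d6570736575ULL;
    std::uint64_t v1 = k1 ^ 0x646f72616e646f6dULL;
    std::uint64_t v2 = k0 ^ 0x6c7967656e657261ULL;
    std::uint64_t v3 = k1 ^ 0x7465646279746573ULL;

    auto sip_round = [&] {
        v0 += v1; v1 = rotl64(v1, 13); v1 ^= v0; v0 = rotl64(v0, 32);
        v2 += v3; v3 = rotl64(v3, 16); v3 ^= v2;
        v0 += v3; v3 = rotl64(v3, 21); v3 ^= v0;
        v2 += v1; v1 = rotl64(v1, 17); v1 ^= v2; v2 = rotl64(v2, 32);
    };

    // Process all complete 8-byte blocks.
    const std::size_t full = len - (len % 8);
    for (std::size_t i = 0; i < full; i += 8) {
        const std::uint64_t m = load_le64(data + i);
        v3 ^= m; sip_round(); sip_round(); v0 ^= m;
    }

    // Final block: leftover bytes plus the length (mod 256) in the top byte.
    std::uint64_t b = static_cast<std::uint64_t>(len) << 56;
    for (std::size_t i = full; i < len; ++i)
        b |= static_cast<std::uint64_t>(data[i]) << (8 * (i - full));
    v3 ^= b; sip_round(); sip_round(); v0 ^= b;

    // Finalization.
    v2 ^= 0xff;
    for (int i = 0; i < 4; ++i) sip_round();
    return v0 ^ v1 ^ v2 ^ v3;
}
```

A hash table would call `siphash24(key_bytes, key_len, k0, k1)` with a key drawn at random at startup, so the same input hashes differently from one process to the next. HighwayHash targets the same use case with wider SIMD mixing.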
8

GNSS-SDR

An open source software-defined GNSS receiver

An open source software-defined Global Navigation Satellite Systems (GNSS) receiver written in C++ and based on the GNU Radio framework.

2 Reviews

Downloads: 1,728 This Week

Last Update: 1 day ago
See Project
9

LinAsm

Collection of fast and optimized assembly libraries for x86-64 Linux

LinAsm is a collection of very fast, SIMD-optimized libraries written in assembly for x86-64 Linux. It implements many common and widely used algorithms for array manipulation: searching, sorting, arithmetic and vector operations, unit conversions; fast mathematical and statistical functions; number and time conversion algorithms; finite impulse response (FIR) digital filters; spectrum analysis algorithms, the fast Hartley transform; CPU-cache-friendly functions and extremely fast abstract data types (ADT) such as hash tables and B-trees, and much more. ...

1 Review

Downloads: 17 This Week

Last Update: 2026-06-09
See Project
10

Armadillo

fast C++ library for linear algebra & scientific computing

* Fast C++ library for linear algebra (matrix maths) and scientific computing
* Easy to use functions and syntax, deliberately similar to Matlab / Octave
* Uses template meta-programming techniques to increase efficiency
* Provides user-friendly wrappers for OpenBLAS, Intel MKL, LAPACK, ATLAS, ARPACK, SuperLU and FFTW libraries
* Useful for machine learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc.
* Downloads:...

Downloads: 2,799 This Week

Last Update: 3 days ago
See Project
11

Klogg

Really fast log explorer based on glogg project

Klogg is an open source multi-platform GUI application for searching through all kinds of text log files using regular expressions. It started as a fork of the glogg project created by Nicolas Bonnefon and has evolved into a separate project with many new features and improvements.

Downloads: 73 This Week

Last Update: 2024-06-25
See Project
12

libfacedetection

Library for face detection in images

...The source code does not depend on any other libraries. All you need is a C++ compiler. You can compile the source code under Windows, Linux, ARM and any platform with a C++ compiler. SIMD instructions are used to speed up the detection. You can enable AVX2 if you use an Intel CPU, or NEON for ARM. The model file is also provided in the directory ./models/. The files examples/detect-image.cpp and examples/detect-camera.cpp show how to use the library. The library was trained by libfacedetection.train. You can copy the files in the directory src/ into your project and compile them like the other files in your project. ...

Downloads: 0 This Week

Last Update: 2021-09-24
See Project
13

PPface

PPface is a vector processor emulator / simulator

PPface is a vector processor emulator / simulator (a SIMD array processor with 1-bit processing elements). The VEPRAN language is used to design parallel algorithms and operate on data slices. The system lets you visualize how an algorithm works by viewing vector memory and registers, and it supports debugging and animated program execution. To run this app on a 64-bit system, please install Windows Virtual PC and Windows XP Mode.

1 Review

Downloads: 0 This Week

Last Update: 2020-03-09
See Project
14

Simd

High performance image processing library in C++

...The algorithms are optimized using different SIMD CPU extensions. In particular, the library supports the following CPU extensions: SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2 and AVX-512 for x86/x64, VMX (AltiVec) and VSX (Power7) for PowerPC, and NEON for ARM. The Simd Library has a C API and also contains useful C++ classes and functions to facilitate access to the C API. The library supports dynamic and static linking, 32-bit and 64-bit Windows, Android and Linux, MSVS, G++ and Clang compilers, and MSVS project and CMake build systems.

3 Reviews

Downloads: 18 This Week

Last Update: 2019-02-01
See Project
15

SWAPHI-LS: Alignment on Xeon Phi Cluster

Smith-Waterman long DNA sequence alignment on Xeon Phi clusters

The first parallel Smith-Waterman algorithm exploiting Intel Xeon Phi clusters to accelerate the alignment of long DNA sequences. This algorithm is written in C++ (with a set of SIMD intrinsic extensions), OpenMP and MPI. The performance evaluation revealed that our algorithm achieves very stable performance, and yields a performance of up to 30.1 GCUPS on a single Xeon Phi and up to 111.4 GCUPS on four Xeon Phis sharing a host.

Downloads: 0 This Week

Last Update: 2016-05-13
See Project
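
For readers unfamiliar with the algorithm being accelerated, here is a minimal scalar sketch of the Smith-Waterman scoring recurrence. It is not the SWAPHI-LS implementation: the function name, the scoring values (match +2, mismatch -1) and the linear gap penalty of -1 are assumptions made for illustration, and there is no SIMD, OpenMP or MPI here. Each inner-loop iteration is one cell update, the unit counted by the GCUPS (billion cell updates per second) figures quoted in this entry.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Best local alignment score of a and b under a linear gap penalty.
// Scoring parameters are illustrative defaults, not values from the project.
int smith_waterman_score(const std::string& a, const std::string& b,
                         int match = 2, int mismatch = -1, int gap = -1) {
    assert(match > 0 && mismatch <= 0 && gap <= 0);
    // Only two rows of the dynamic-programming matrix are kept in memory.
    std::vector<int> prev(b.size() + 1, 0), cur(b.size() + 1, 0);
    int best = 0;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = 0;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
            const int up   = prev[j] + gap;      // gap in b
            const int left = cur[j - 1] + gap;   // gap in a
            cur[j] = std::max({0, diag, up, left});  // 0 restarts the local alignment
            best = std::max(best, cur[j]);
        }
        std::swap(prev, cur);
    }
    return best;
}
```

The dependency of each cell on its left, upper and diagonal neighbours is what makes vectorizing this loop non-trivial, which is why dedicated SIMD implementations exist.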
16

ddd-3.3.12patch

The comm-manag patch allows ddd to start up correctly: it loads all the debug info files, sets breakpoints correctly, and saves and loads the project correctly. The ddd-3.3.12-260210 patch makes the register display usable for MMX and SIMD instructions (32- and 64-bit).

Downloads: 0 This Week

Last Update: 2012-09-25
See Project
17

Vector3D SSE

A C++ header library for fast operations on vectors/matrices (3D/3x3) using Streaming SIMD Extensions (SSE, SSE2, SSE3, SSE4). It is typically used in 3D graphics applications and game development.

Downloads: 0 This Week

Last Update: 2013-04-24
See Project
18

Fast SIFT Image Features Library

A cross-platform library that computes fast and accurate SIFT image features. libsiftfast provides Octave/Matlab scripts, a command line interface, and a Python interface (siftfastpy). Optimized with SIMD instructions and OpenMP.

2 Reviews

Downloads: 0 This Week

Last Update: 2015-12-02
See Project
19

SSEPlus

SSEPlus is a SIMD function library. It provides optimized emulation for newer SSE instructions. It also provides a rich set of high performance routines for common operations such as arithmetic, bitwise logic, and data packing and unpacking.

1 Review

Downloads: 0 This Week

Last Update: 2013-04-24
See Project
20

parallel for

A data-parallel scientific programming model. Compiles efficiently to different platforms such as distributed memory (MPI), shared-memory multiprocessors (pthreads), the Cell BE processor, NVIDIA CUDA, SIMD vectorization (SSE, AltiVec), and sequential C++ code.

Downloads: 0 This Week

Last Update: 2013-04-10
See Project
21

Cross-platform SIMD C Headers

A cross-platform, cross-compiler, cross-CPU C header library for programming with SIMD instruction sets. Supported targets are x86 (MMX/SSE/SSE2) with GCC and MSVC, PPC AltiVec with GCC, ARM WMMX with GCC, and software-emulated SIMD.

Downloads: 0 This Week

Last Update: 2015-06-28
See Project