simd free download - SourceForge

Showing 25 open source projects for "simd"

1

SIMD

C++ wrappers for SIMD intrinsics

SIMD is a C++ library that provides portable abstractions over SIMD (Single Instruction, Multiple Data) instructions, enabling developers to write high-performance vectorized code without dealing directly with architecture-specific intrinsics. SIMD instructions allow a single operation to be applied to multiple data elements simultaneously, significantly accelerating numerical and data-parallel computations.

Downloads: 1 This Week

Last Update: 2026-04-29
2

HLSL++

Math library using HLSL syntax with multiplatform SIMD support

...It provides vector, matrix, and math operations with a syntax identical or very similar to HLSL, allowing seamless transition between shader code and application code. The library is optimized for performance and supports SIMD instructions across multiple architectures, including SSE, AVX, AVX2, AVX512, and ARM NEON, ensuring high efficiency on modern hardware. It also extends beyond standard HLSL capabilities by introducing additional features such as quaternion support, advanced matrix operations, and extended vector types like float8. The library is particularly valuable for game developers who need consistency between CPU and GPU computations, reducing errors and improving maintainability.

Downloads: 1 This Week

Last Update: 2026-05-05
3

Google Highway

Performance-portable, length-agnostic SIMD with runtime dispatch

Google Highway is a high-performance C++ library designed to provide portable SIMD (Single Instruction, Multiple Data) vectorization across multiple CPU architectures while maintaining predictable and efficient behavior. It abstracts low-level vector intrinsics into a consistent API that maps closely to hardware instructions, allowing developers to write high-performance code without relying heavily on compiler auto-vectorization.

Downloads: 0 This Week

Last Update: 2026-04-23
4

LoopVectorization.jl

Macro(s) for vectorizing loops

LoopVectorization.jl is a Julia package for accelerating numerical loops by automatically applying SIMD (Single Instruction, Multiple Data) vectorization and other low-level optimizations. It analyzes loops and generates highly efficient code that leverages CPU vector instructions, making it ideal for performance-critical computing in fields such as scientific computing, signal processing, and machine learning.

Downloads: 0 This Week

Last Update: 2026-05-30
5

ispc

Intel SPMD Program Compiler

...Under the SPMD model, the programmer writes a program that generally appears to be a regular serial program, though the execution model is actually that a number of program instances execute in parallel on the hardware. ispc compiles a C-based SPMD programming language to run on the SIMD units of CPUs and GPUs; it frequently provides a 3x or more speedup on architectures with 4-wide vector SSE units and 5x-6x on architectures with 8-wide AVX vector units, without any of the difficulty of writing intrinsics code. ispc also supports parallelization across multiple cores, making it possible to write programs whose performance scales with both the number of cores and the vector unit size. ...

Downloads: 0 This Week

Last Update: 2026-02-04
6

node-rs

Node.js bindings for Rust crates

When Node.js meets Rust. Bindings that expose Rust crates to Node.js, built with napi-rs.

Downloads: 0 This Week

Last Update: 2024-12-05
7

torchvision

Datasets, transforms and models specific to Computer Vision

The torchvision package consists of popular datasets, model architectures, and common image transformations for computer vision. Anaconda is the recommended Python package management system. Torchvision currently supports the following image backends: Pillow (the default); Pillow-SIMD, a much faster drop-in replacement for Pillow with SIMD, which is used as the default if installed; and accimage, which, if installed, can be activated by calling torchvision.set_image_backend('accimage'). It also uses libpng, which can be installed via conda (conda install libpng) or any of the package managers for Debian-based and RHEL-based Linux distributions, and libjpeg, which can be installed via conda (conda install jpeg) or the same package managers. ...

Downloads: 1 This Week

Last Update: 2026-06-09
8

Polars

Dataframes powered by a multithreaded, vectorized query engine

Polars is a high-performance, multi-language DataFrame library built in Rust using Apache Arrow. It delivers blazing-fast, vectorized, and parallel data manipulation with both eager and lazy execution, making it an excellent tool for data processing in Python, Rust, Node.js, R, and SQL contexts.

Downloads: 0 This Week

Last Update: 2026-06-04
9

Numba

NumPy aware dynamic Python compiler using LLVM

Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code. Numba translates Python functions to optimized machine code at runtime using the industry-standard LLVM compiler library. Numba-compiled numerical algorithms in Python can approach the speeds of C or FORTRAN. You don't need to replace the Python interpreter, run a separate compilation step, or even have a C/C++ compiler installed. Just apply one of the Numba decorators to your...

Downloads: 1 This Week

Last Update: 2026-04-23
10

XNNPACK

High-efficiency floating-point neural network inference operators

...Rather than serving as a standalone ML framework, XNNPACK provides high-performance computational primitives (such as convolutions, pooling, activation functions, and arithmetic operations) that are integrated into higher-level frameworks like TensorFlow Lite, PyTorch Mobile, ONNX Runtime, TensorFlow.js, and MediaPipe. The library is written in C/C++ and designed for maximum portability, efficiency, and performance, leveraging platform-specific SIMD instruction sets (e.g., NEON, AVX) for optimized execution. It supports NHWC tensor layouts and allows flexible striding along the channel dimension to efficiently handle channel-split and concatenation operations without additional cost.

Downloads: 2 This Week

Last Update: 7 days ago
11

Zerocopy

Zerocopy makes zero-cost memory manipulation effortless

Zerocopy is a Rust library designed to make zero-cost memory manipulation both safe and effortless. It allows developers to reinterpret or convert raw byte sequences into structured types—and vice versa—without writing unsafe code directly. The crate provides safe abstractions for transmuting data while preserving Rust’s strict safety guarantees, removing the need for manual memory manipulation. Zerocopy introduces a suite of conversion traits such as TryFromBytes, FromBytes, IntoBytes, and...

Downloads: 0 This Week

Last Update: 2026-06-09
12

UniSIMD-assembler

SIMD macro assembler unified for ARM, MIPS, PPC and x86

UniSIMD assembler is a high-level C/C++ macro assembler framework unified across ARM, MIPS, POWER and x86 architectures. It establishes a subset of both BASE and SIMD instruction sets with a clearly defined common API, so that application logic can be written and maintained in one place without code replication. The assembler is not a separate tool but a collection of C/C++ header files that applications include directly. At present, the following targets are mostly implemented (with horizontal reductions): Intel SSE/SSE2/SSE4 and AVX/AVX2/AVX-512 (32/64-bit x86 ISAs); ARMv7 NEON/NEONv2, ARMv8 AArch32 and AArch64 NEON, and SVE (32/64-bit ARM ISAs); MIPS 32/64-bit r5/r6 MSA; and POWER 32/64-bit VMX/VSX (little/big-endian ISAs). Scalar improvements and wider SIMD vectors with zeroing/merging predicates in 3/4-operand instructions are planned as extensions to the current 2/3-operand SPMD-driven vertical SIMD ISA. ...

Downloads: 0 This Week

Last Update: 2024-11-20
13

sleef

Vectorized libm

SLEEF stands for SIMD Library for Evaluating Elementary Functions. SLEEF implements vectorized versions of all C99 math functions, which use the SIMD instructions of modern processors to make computation more efficient. The library also includes vectorized DFT subroutines.

Downloads: 0 This Week

Last Update: 2025-01-28
14

Vector Pascal Compiler

Vector Pascal is a language targeted at SIMD multi-core instruction sets such as AVX, SSE2, or x86-64-v3. Its SIMD compiler supports parallel vector operations, loop unrolling, common subexpression elimination, and similar optimizations. It is implemented in Java.

1 Review

Downloads: 4 This Week

Last Update: 2 days ago
15

HighwayHash

Fast strong hash functions: SipHash/HighwayHash

HighwayHash is a fast, keyed hash function intended for scenarios where you need strong, DoS-resistant hashing without the full overhead of a general-purpose cryptographic hash. It’s designed to defeat hash-flooding attacks by mixing input with wide SIMD operations and a branch-free inner loop, so adversaries can’t cheaply craft many colliding keys. The implementation targets multiple CPU families with vectorized code paths while keeping a portable fallback, yielding high throughput across platforms. It exposes simple one-shot and streaming APIs, so you can hash short keys or long byte streams with the same function. ...

Downloads: 0 This Week

Last Update: 2025-10-10
16

GNSS-SDR

An open source software-defined GNSS receiver

An open source software-defined Global Navigation Satellite Systems (GNSS) receiver written in C++ and based on the GNU Radio framework.

2 Reviews

Downloads: 1,589 This Week

Last Update: 1 day ago
17

Armadillo

fast C++ library for linear algebra & scientific computing

* Fast C++ library for linear algebra (matrix maths) and scientific computing
* Easy to use functions and syntax, deliberately similar to Matlab / Octave
* Uses template meta-programming techniques to increase efficiency
* Provides user-friendly wrappers for OpenBLAS, Intel MKL, LAPACK, ATLAS, ARPACK, SuperLU and FFTW libraries
* Useful for machine learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc.
* Downloads:...

Downloads: 2,993 This Week

Last Update: 3 days ago
18

libjpeg-turbo

SIMD-accelerated libjpeg-compatible JPEG codec library

libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2, NEON, AltiVec) to accelerate baseline JPEG compression and decompression on x86, x86-64, ARM, and PowerPC systems. On such systems, libjpeg-turbo is generally 2-6x as fast as libjpeg, all else being equal. On other types of systems, libjpeg-turbo can still outperform libjpeg by a significant amount, by virtue of its highly-optimized Huffman coding routines.

16 Reviews

Downloads: 43,119 This Week

Last Update: 2024-01-13
19

TurboPFor

Fastest Integer Compression

Fastest Integer Compression. All functions are available for AMD/Intel, 64-bit ARMv8 NEON (Linux and macOS/M1), and Power9 AltiVec. 100% C (C++ headers), as simple to use as memcpy. Supported platforms: Linux (amd64, arm64, Power9) and macOS (AMD/Intel and Apple M1).

Downloads: 0 This Week

Last Update: 2024-05-30
20

Klogg

Really fast log explorer based on glogg project

Klogg is an open source multi-platform GUI application for searching through all kinds of text log files using regular expressions. It started as a fork of the glogg project created by Nicolas Bonnefon and has evolved into a separate project with many new features and improvements.

Downloads: 75 This Week

Last Update: 2024-06-25
21

jpegant

Embedded JPEG encoder

...Release 1.1 source code and Windows executables are on the download page. For the latest release code, see the 'release-1-0' branch in the repository. The SSE2 implementation is in the 'simd.0' branch in the repository.

Downloads: 0 This Week

Last Update: 2016-11-21
22

Vector3D SSE

A C++ header library for fast operations on vectors/matrices (3D/3x3) using Streaming SIMD Extensions (SSE, SSE2, SSE3, SSE4). It is typically used in 3D graphics applications and game development.

Downloads: 0 This Week

Last Update: 2013-04-24
23

SSEPlus

SSEPlus is a SIMD function library. It provides optimized emulation for newer SSE instructions. It also provides a rich set of high performance routines for common operations such as arithmetic, bitwise logic, and data packing and unpacking.

1 Review

Downloads: 0 This Week

Last Update: 2013-04-24
24

parallel for

A data-parallel scientific programming model. It compiles efficiently to different platforms, including distributed memory (MPI), shared-memory multiprocessors (pthreads), the Cell BE processor, NVIDIA CUDA, SIMD vectorization (SSE, AltiVec), and sequential C++ code.

Downloads: 0 This Week

Last Update: 2013-04-10
25

Cross-platform SIMD C Headers

A cross-platform, cross-compiler, cross-CPU C header library for programming with SIMD instruction sets. Supported targets are x86 (MMX/SSE/SSE2) with GCC and MSVC, PPC AltiVec with GCC, WMMX ARM with GCC, and software-emulated SIMD.

Downloads: 0 This Week

Last Update: 2015-06-28