Showing 32 open source projects for "parallel"

  • 1

    LightGBM

    Gradient boosting framework based on decision tree algorithms

    LightGBM, or Light Gradient Boosting Machine, is a high-performance, open source gradient boosting framework based on decision tree algorithms. Compared to other boosting frameworks, LightGBM offers advantages in speed, efficiency, and accuracy. Parallel experiments have shown that LightGBM can attain linear speed-up across multiple machines for training in specific settings, all while consuming less memory. LightGBM supports parallel and GPU learning and can handle large-scale data. It has become widely used for ranking, classification, and many other machine learning tasks (a minimal training sketch follows this entry).
    Downloads: 5 This Week
    Last Update:
    See Project
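
    The sketch below illustrates the kind of workflow described above: training a binary classifier with LightGBM's Python package on multiple CPU cores. The dataset, parameter values, and the optional GPU switch are illustrative assumptions, not part of this listing.

    ```python
    # Minimal LightGBM training sketch (assumes the lightgbm and scikit-learn
    # packages are installed; data and parameters are illustrative only).
    import lightgbm as lgb
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    params = {
        "objective": "binary",
        "num_leaves": 31,
        "num_threads": 0,        # 0 = use all available CPU cores for parallel tree construction
        # "device_type": "gpu",  # GPU learning, if LightGBM was built with GPU support
    }
    booster = lgb.train(params, lgb.Dataset(X_train, label=y_train), num_boost_round=100)
    accuracy = ((booster.predict(X_test) > 0.5) == y_test).mean()
    print(f"held-out accuracy: {accuracy:.3f}")
    ```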
  • 2

    ncnn

    High-performance neural network inference framework for mobile

    ncnn is a high-performance neural network inference computing framework designed specifically for mobile platforms. It brings artificial intelligence right to your fingertips with no third-party dependencies, and runs faster than all other known open source frameworks on mobile phone CPUs. ncnn allows developers to easily deploy deep learning models to mobile platforms and create intelligent apps. It is cross-platform and supports most commonly used CNN networks, including...
    Downloads: 32 This Week
    Last Update:
    See Project
  • 3

    PaddlePaddle

    PArallel Distributed Deep LEarning: Machine Learning Framework

    PaddlePaddle is an open source deep learning industrial platform with advanced technologies and a rich set of features that make innovation and application of deep learning easier. It is the only independent R&D deep learning platform in China and has been widely adopted in sectors including manufacturing, agriculture, and enterprise services. PaddlePaddle covers core deep learning frameworks, basic model libraries, end-to-end development kits, and more, with support for both... A minimal training sketch follows this entry.
    Downloads: 4 This Week
    Last Update:
    See Project
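
    As a concrete illustration of the framework described above, here is a minimal sketch of PaddlePaddle's imperative (dygraph) API training a tiny regression model; the toy data and layer sizes are assumptions made for the example.

    ```python
    # Minimal PaddlePaddle training sketch (assumes the paddlepaddle package is installed).
    import paddle

    model = paddle.nn.Sequential(
        paddle.nn.Linear(10, 32),
        paddle.nn.ReLU(),
        paddle.nn.Linear(32, 1),
    )
    opt = paddle.optimizer.Adam(learning_rate=1e-3, parameters=model.parameters())

    x = paddle.randn([64, 10])   # toy inputs
    y = paddle.randn([64, 1])    # toy targets
    for step in range(100):
        loss = paddle.nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
        opt.clear_grad()
    print("final loss:", float(loss))
    ```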
  • 4

    ROOT

    Analyzing, storing and visualizing big data, scientifically

    ...ROOT comes with histogramming capabilities in an arbitrary number of dimensions, curve fitting, statistical modeling, and minimization, allowing the easy setup of a data analysis system that can query and process the data interactively or in batch mode, as well as a general parallel processing framework, RDataFrame, that can considerably speed up an analysis (a minimal PyROOT sketch follows this entry).
    Downloads: 6 This Week
    Last Update:
    See Project
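
    To illustrate the RDataFrame parallel-processing framework mentioned above, here is a minimal PyROOT sketch; the file name, tree name, and branch name are placeholders, not taken from this listing.

    ```python
    # Minimal RDataFrame sketch (assumes a PyROOT installation and a ROOT file
    # "data.root" containing a TTree named "events" with a branch "pt").
    import ROOT

    ROOT.EnableImplicitMT()  # let RDataFrame process entries in parallel
    df = ROOT.RDataFrame("events", "data.root")
    h = df.Filter("pt > 20").Histo1D("pt")  # lazy: runs when the result is accessed
    print("selected entries:", h.GetEntries())
    ```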
  • 5

    TensorRT

    C++ library for high performance inference on NVIDIA GPUs

    ...With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers, embedded systems, or automotive product platforms. TensorRT is built on CUDA®, NVIDIA’s parallel programming model, and lets you optimize inference using libraries, development tools, and technologies in CUDA-X™ for artificial intelligence, autonomous machines, high-performance computing, and graphics. With NVIDIA Ampere Architecture GPUs, TensorRT also leverages sparse Tensor Cores for an additional performance boost (a minimal engine-building sketch follows this entry).
    Downloads: 18 This Week
    Last Update:
    See Project
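
    The following sketch illustrates the optimization-and-deployment flow described above using TensorRT's Python API (TensorRT 8 or later assumed): parse an ONNX model, enable reduced precision, and serialize an engine. The file names are placeholders.

    ```python
    # Minimal TensorRT engine-building sketch (assumes the tensorrt Python
    # bindings and an ONNX model file "model.onnx"; paths are placeholders).
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, logger)

    with open("model.onnx", "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError("failed to parse ONNX model")

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)  # allow reduced-precision kernels
    engine_bytes = builder.build_serialized_network(network, config)
    with open("model.plan", "wb") as f:
        f.write(engine_bytes)  # serialized engine for later deployment
    ```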
  • 6

    ArrayFire

    ArrayFire, a general purpose GPU library

    ArrayFire is a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market. Data structures in ArrayFire are smartly managed to avoid costly memory transfers and to take advantage of each performance feature provided by the underlying hardware. The community of ArrayFire developers invites you to build with us if you're interested in and able to write top-performing tensor functions. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7

    OneFlow

    OneFlow is a deep learning framework designed to be user-friendly

    OneFlow is a deep learning framework designed to be user-friendly, scalable, and efficient. An extension allows OneFlow to target third-party compilers such as XLA, TensorRT, and OpenVINO. The CUDA runtime is statically linked into OneFlow, so OneFlow works on the minimum supported driver and any driver beyond it. Distributed performance (efficiency) is the core technical difficulty of a deep learning framework. OneFlow focuses on performance improvement and heterogeneous...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8

    fairseq2

    FAIR Sequence Modeling Toolkit 2

    ...It supports multi-GPU and multi-node distributed training using DDP, FSDP, and tensor parallelism, capable of scaling up to 70B+ parameter models. The framework integrates seamlessly with PyTorch 2.x features such as torch.compile, Fully Sharded Data Parallel (FSDP), and modern configuration management.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 9

    CTranslate2

    Fast inference engine for Transformer models

    CTranslate2 is a C++ and Python library for efficient inference with Transformer models. The project implements a custom runtime that applies many performance optimization techniques such as weight quantization, layer fusion, and batch reordering to accelerate Transformer models and reduce their memory usage on CPU and GPU. On supported models and tasks, execution is significantly faster and requires fewer resources than general-purpose deep learning frameworks thanks to many... A minimal translation sketch follows this entry.
    Downloads: 2 This Week
    Last Update:
    See Project
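
    A minimal sketch of the inference workflow described above, assuming a model that has already been converted to the CTranslate2 format; the model directory and the pre-tokenized example input are placeholders.

    ```python
    # Minimal CTranslate2 translation sketch (assumes the ctranslate2 package
    # and a converted model directory; path and tokens are placeholders).
    import ctranslate2

    translator = ctranslate2.Translator("ende_ctranslate2/", device="cpu",
                                        inter_threads=2, intra_threads=4)
    # CTranslate2 expects pre-tokenized input; tokenization depends on the model.
    results = translator.translate_batch([["▁Hello", "▁world"]])
    print(results[0].hypotheses[0])  # list of output tokens for the best hypothesis
    ```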
  • 10

    EnvPool

    C++-based high-performance parallel environment execution engine

    EnvPool is a fast, asynchronous, and parallel RL environment library designed for scaling reinforcement learning experiments. Developed by SAIL in Singapore, it pairs a C++ backend with a Python frontend for extremely high-speed environment interaction, supporting thousands of environments running in parallel on a single machine. It is compatible with the Gymnasium API and RLlib, making it suitable for scalable training pipelines (a minimal usage sketch follows this entry).
    Downloads: 0 This Week
    Last Update:
    See Project
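
    A minimal sketch of stepping a batch of parallel environments through EnvPool's gym-style interface; the environment id, batch size, and random policy are assumptions for illustration, and the exact return signature depends on the selected env_type.

    ```python
    # Minimal EnvPool sketch (assumes the envpool and numpy packages and the
    # Atari "Pong-v5" environment; illustrative only).
    import numpy as np
    import envpool

    num_envs = 16
    env = envpool.make("Pong-v5", env_type="gym", num_envs=num_envs)

    obs = env.reset()  # batched observations, one row per parallel environment
    for _ in range(10):
        actions = np.random.randint(env.action_space.n, size=num_envs)
        obs, rewards, dones, info = env.step(actions)  # gym-style 4-tuple
    print("batched observation shape:", obs.shape)
    ```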
  • 11

    Evolutionary Computation Framework

    C++ framework for application of any type of evolutionary computation.

    ECF is a framework intended for the application of any type of evolutionary computation (GA/GP, DE, Clonalg, ES, PSO, ABC, GAn, local search...). It offers simplicity for the end user (parameterless usage, tutorial) and customization for experienced EC practitioners.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12

    YOLO ROS

    YOLO ROS: Real-Time Object Detection for ROS

    ...Darknet on the CPU is fast (approximately 1.5 seconds on an Intel Core i7-6700HQ CPU @ 2.60GHz × 8), but it is roughly 500 times faster on a GPU. You will need an Nvidia GPU and will have to install CUDA. The CMakeLists.txt file automatically detects whether CUDA is installed. CUDA is a parallel computing platform and application programming interface (API) model created by Nvidia.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13

    SINGA

    A distributed deep learning platform

    Apache SINGA is an Apache Top Level Project focusing on distributed training of deep learning and machine learning models. Various example deep learning models are provided in the SINGA repo on GitHub and on Google Colab. SINGA supports data-parallel training across multiple GPUs (on a single node or across different nodes). SINGA supports various popular optimizers, including stochastic gradient descent with momentum, Adam, RMSProp, and AdaGrad. SINGA records the computation graph and applies backward propagation automatically after forward propagation. Memory optimization is implemented in the Device class. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14

    CUDA-JMI

    Tool for feature selection using the JMI metric and multiple GPUs

    CUDA-JMI is a parallel tool to accelerate the feature selection process using Joint Mutual Information as the metric. It receives as input a file in ARFF, CSV, or LIBSVM format containing the values of m individuals and n features, and returns a file with the features that provide the most non-redundant information.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15

    Scalable Distributed Deep-RL

    A TensorFlow implementation of Scalable Distributed Deep-RL

    ...IMPALA introduced a new paradigm for efficiently training agents across large-scale environments by decoupling acting and learning processes. In this architecture, multiple actor processes interact with their environments in parallel to collect trajectories, which are then asynchronously sent to a centralized learner for policy updates. The learner uses importance weighting to correct for policy lag between actors and the learner, enabling stable off-policy training at scale. This design allows the system to scale efficiently to hundreds of environments and billions of frames while maintaining sample efficiency and stability. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16

    Virtual Laboratory Environment

    A multi-modeling and simulation environment to study complex systems

    VLE is a multi-modeling and simulation environment for studying complex dynamic systems. VLE is based on the discrete event specification DEVS and implements the DSDE formalism (a merge of Dynamic Structure DEVS, DSDEVS, with Parallel DEVS, PDEVS). VLE provides a complete set of C++ libraries, called VFL (VLE Foundation Libraries), to develop DEVS models, to get simulation results, and to launch simulations on a cluster. Models can be developed with the DEVS formalism or with classical mathematical formalisms: ordinary differential equations (with Euler, Runge-Kutta, or QSS integrators), finite state automata (FDDEVS, UML state charts), and hybrid Petri nets. ...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 17

    LightPCC

    Parallel pairwise correlation computation on Intel Xeon Phi clusters

    The first parallel and distributed library for pairwise correlation/dependence computation on Intel Xeon Phi clusters. This library is written as C++ template classes and achieves high speed by exploiting SIMD-instruction-level and thread-level parallelism within each Xeon Phi as well as accelerator-level parallelism among multiple Xeon Phis. To facilitate balanced workload distribution, we have proposed a general framework for symmetric all-pairs computation that, for the first time, builds provable bijective functions between the job identifier and the coordinate space.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Genetic Programming Classifier is a distributed evolutionary data classification program. It uses the ensemble method implemented under a parallel co-evolutionary Genetic Programming technique.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19

    aCompute

    Aims to enable researcher to tap in to mobile computing capability

    This is a software-agent-based computing program that enables researchers and other users to tap the computing power of available machines by sharing workloads on the fly, with zero configuration of the network and resources. A self-organizing agent program understands the network and its resources; the only job left to the researcher is to split the work into chunks of parallel or sequential jobs and issue them (a visual modeler or scripting support is yet to be designed). Software agents automatically manage the rest: resource management, sharing, cloning of tasks, and so on. New resources can be added to and removed from the system on the fly. In layman's terms, the project creates an agent program that enables sharing and execution of programs among all available resources, whether desktop, laptop, or PDA, so that research can be accelerated to the full extent of resource availability without worrying about anything else. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    The Thot toolkit repository has moved to http://daormar.github.io/thot/. Thot is a toolkit for statistical machine translation. The new Thot toolkit includes fully automatic and interactive machine translation, incremental training of statistical models, parallel estimation, ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Genetic Programming in OpenCL is a parallel implementation of genetic programming targeted at heterogeneous devices, such as CPU and GPU. It is written in OpenCL, an open standard for portable parallel programming across many computing platforms.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22

    Parallel Neural Networks

    Neural networks in CUDA & OpenCL with back propagation algorithm

    ...Its aim is to compare the efficiency of both technologies and to check which optimizations work better in which setting. A further task is to compare different ways of decomposing computations in parallel.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Crackpot is an action planning system based on iterative repair. It features real-time optimization and revision, auto-scaling on parallel architectures, and handles temporal reasoning, resources, and an open, complex planning world.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24

    FeedForwardNeuralNetworkC++

    Feedforward Neural Network written in C++

    Feedforward neural network written in C++, in a serial version and a version parallelized with the TBB library. It also uses the Autotune library for the best parallel performance.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Parallel Reinforcement Evolutionary Artificial Neural Networks (PREANN) is a framework of flexible multi-layer ANNs with reinforcement learning based on genetic algorithms, plus a parallel implementation (using XMM registers and NVIDIA's CUDA).
    Downloads: 0 This Week
    Last Update:
    See Project