Showing 117 open source projects for "compute"

  • 1
    Compute Library

    The Compute Library is a set of computer vision and machine learning functions optimized for both Arm CPUs and GPUs using SIMD technologies. The library provides superior performance to other open-source alternatives and immediate support for new Arm® technologies such as SVE2.
    Downloads: 4 This Week
  • 2
    CUDA Core Compute Libraries (CCCL)

    CUDA Core Compute Libraries

    CCCL, or CUDA Core Compute Libraries, is a unified repository that consolidates several foundational CUDA C++ libraries into a single, cohesive development platform. It brings together Thrust, CUB, and libcudacxx, which collectively provide high-level abstractions, low-level performance primitives, and a CUDA-compatible standard library for GPU programming.
    Downloads: 2 This Week
  • 3
    tt-metal

    TT-NN operator library, and TT-Metalium low level kernel programming

    ...The project is designed for developers who need direct access to the company’s Tensix processor architecture, exposing a programming model that is closer to hardware control than high-level inference frameworks. Instead of following a traditional GPU model centered on massive thread parallelism, the platform is built around a grid of specialized compute nodes called Tensix cores, each with local SRAM, dedicated compute units, and multiple RISC-V control processors. The SDK provides the abstractions and APIs needed to manage data movement, compute kernels, memory coordination, and execution flow across this architecture.
    Downloads: 13 This Week
  • 4

    Halide

    A language for fast, portable data-parallel computation

    ...It was designed to make writing high-performance image and array processing code much easier on modern machines. It works on all major operating systems and with several CPU architectures (X86, ARM, MIPS, Hexagon, PowerPC) and GPU compute APIs (CUDA, OpenCL, OpenGL, among others). It isn't a standalone programming language, however; it is embedded in C++, which means that you write C++ code that builds an in-memory representation of a Halide pipeline using Halide's C++ API. This representation can then be compiled to an object file, or JIT-compiled and run in the same process. ...
    Downloads: 0 This Week
  • 5
    VulkanSceneGraph

    Vulkan & C++17 based Scene Graph Project

    VulkanSceneGraph (VSG) is a modern, cross-platform, high-performance scene graph library built upon the Vulkan graphics/compute API. The software is written in C++17 and follows the CppCoreGuidelines and FOSS Best Practices. The source code is published under the MIT License, with the exception of vulkan.h, used for Vulkan extensions, which is under Apache License 2.0. This repository contains C++ headers and source and CMake build scripts to build the libvsg library.
    Downloads: 6 This Week
  • 6
    Step 3.5 Flash

    Fast, Sharp & Reliable Agentic Intelligence

    ...Unlike dense models that activate all their parameters for every token, Step 3.5 Flash uses a sparse Mixture-of-Experts (MoE) architecture that selectively engages only about 11 billion of its roughly 196 billion total parameters per token, delivering high-quality reasoning and interaction at far lower compute cost and latency than traditional large models. Its design targets deep reasoning, long-context handling, coding, and real-time responsiveness, making it suitable for building autonomous agents, advanced assistants, and long-chain cognitive workflows without sacrificing performance.
    Downloads: 5 This Week
  • 7
    PyTorch/XLA

    Enabling PyTorch on Google TPU

    ...Cloud TPU VM is currently generally available and provides direct access to the TPU host. The recommended setup for running distributed training on TPU Pods pairs Compute VM instance groups with TPU Pods. Each Compute VM in the instance group drives 8 cores on the TPU Pod.
    Downloads: 0 This Week
  • 8
    FlashMLA

    FlashMLA: Efficient Multi-head Latent Attention Kernels

    ...The library supports both BF16 and FP16 data types, and includes a paged KV cache implementation with a block size of 64 to efficiently manage memory during decoding. In heavily compute-bound settings, it can reach up to ~660 TFLOPS on H800 SXM5 hardware, while in memory-bound configurations it can push memory throughput to ~3000 GB/s. The team regularly updates it with performance improvements; for example, a 2025 update claims 5% to 15% gains on compute-bound workloads while maintaining API compatibility.
    Downloads: 0 This Week
  • 9
    Taichi

    Productive, portable, and performant GPU programming in Python

    Taichi is an open-source DSL embedded in Python, designed for high-performance numerical and physical simulations. It uses JIT compilation (via LLVM and its runtime TiRT) to offload compute-heavy code to CPUs, GPUs, mobile devices, and embedded systems. With built-in support for sparse data structures (SNode), automatic differentiation, AOT deployment, and compatibility with CUDA, Vulkan, Metal, and OpenGL ES, it empowers disciplines like simulation, graphics, AI, and robotics.
    Downloads: 2 This Week
  • 10

    Microsoft SEAL

    Easy-to-use and powerful homomorphic encryption library

    ...Developed by the Cryptography and Privacy Research group at Microsoft, it enables software engineers to build end-to-end encrypted data storage and computation services that never have to procure the customer's key. Microsoft SEAL is very easy to use, compile and run in many different environments. Homomorphic encryption is an encryption scheme that allows the cloud to compute directly on the encrypted data, without requiring the data to be decrypted first. This results in encrypted computations remaining encrypted, decrypted only by the data owner using the secret key.
    Downloads: 2 This Week
  • 11
    NVTX (NVIDIA Tools Extension Library)

    C-based Application Programming Interface (API)

    ...It allows developers to insert markers, ranges, and events directly into their applications, providing contextual insight into how code executes on CPUs and GPUs. These annotations are visualized in tools such as NVIDIA Nsight Systems and Nsight Compute, enabling developers to identify performance bottlenecks, track execution flow, and correlate application behavior with hardware activity. The API is written in C and includes wrappers for C++ and Python, making it accessible across different programming environments and workloads. NVTX is particularly valuable in high-performance computing and AI workloads where understanding concurrency, memory usage, and kernel execution is critical for optimization.
    Downloads: 3 This Week
  • 12
    Point Cloud Library

    A standalone, large-scale, open project for 2D/3D image processing

    The Point Cloud Library (PCL) is a standalone, large-scale, open project for 2D/3D image and point cloud processing. PCL is released under the terms of the BSD license, and is thus free for commercial and research use. Whether you’ve just discovered PCL or you’re a long-time veteran, this page contains links to a set of resources that will help consolidate your knowledge of PCL and 3D processing. An additional Wiki resource for developers is available too. To simplify both usage and...
    Downloads: 10 This Week
  • 13
    Dragonfly

    A modern replacement for Redis and Memcached

    ...Dragonfly is optimized for modern cloud computing, delivering 25x more throughput and 12x lower snapshotting latency when compared to legacy in-memory data stores like Redis, making it easy to deliver the real-time experience your customers expect. Scaling Redis workloads is expensive due to their inefficient, single-threaded model. Dragonfly is far more compute and memory efficient, resulting in up to 80% lower infrastructure costs. Dragonfly scales vertically first, only requiring clustering at an extremely high scale. This results in a far simpler operational model and a more reliable system.
    Downloads: 3 This Week
  • 14
    Cactus

    Low-latency AI inference engine optimized for mobile devices

    ...It supports a wide range of AI tasks including text generation, speech-to-text, vision processing, and retrieval-augmented workflows through a unified API interface. A notable feature of Cactus is its hybrid execution model, which can dynamically route tasks between on-device processing and cloud services when additional compute is required.
    Downloads: 2 This Week
  • 15
    bitnet.cpp

    Official inference framework for 1-bit LLMs

    ...At its core is bitnet.cpp, a highly optimized C++ backend that supports fast, low-memory inference on both CPUs and GPUs, enabling models such as BitNet b1.58 to run without requiring enormous compute infrastructure. The project’s focus on extreme quantization dramatically reduces memory footprint and energy consumption compared with traditional 16-bit or 32-bit LLMs, making it practical to deploy advanced language understanding and generation models on everyday machines. BitNet is built to scale across architectures, with configurable kernels and tiling strategies that adapt to different hardware, and it supports large models with impressive throughput even on modest resources.
    Downloads: 2 This Week
  • 16
    oneDNN

    oneAPI Deep Neural Network Library (oneDNN)

    This software was previously known as Intel(R) Math Kernel Library for Deep Neural Networks (Intel(R) MKL-DNN) and Deep Neural Network Library (DNNL). oneAPI Deep Neural Network Library (oneDNN) is an open-source cross-platform performance library of basic building blocks for deep learning applications. oneDNN is part of oneAPI. The library is optimized for Intel(R) Architecture Processors, Intel Processor Graphics and Xe Architecture graphics. oneDNN has experimental support for the...
    Downloads: 3 This Week
  • 17
    cuDF

    GPU DataFrame Library

    ...The RAPIDS suite of open-source software libraries aims to enable the execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
    Downloads: 4 This Week
  • 18
    Diligent Core

    A modern cross-platform low-level graphics API

    DiligentCore is a low-level, cross-platform rendering library designed to provide a modern graphics abstraction layer over Direct3D11, Direct3D12, OpenGL, Vulkan, and Metal. It’s aimed at developers building high-performance rendering engines and scientific visualization tools. DiligentCore gives precise control over GPU resources and rendering pipelines, while also abstracting away platform-specific boilerplate. The library is modular, extensible, and well-suited for projects that require...
    Downloads: 0 This Week
  • 19
    COCOON

    Confidential Compute Open Network, Decentralized AI Inference on TON

    COCOON is a privacy-aware desktop client framework designed by the developers of Telegram to provide a modern, secure, and extensible environment for building messaging and communication applications. At its core, it combines native desktop performance with web-like flexibility, packing a renderer, UI components, and plugin architecture that allows developers to craft rich experiences similar to those found in native apps. Cocoon’s architecture prioritizes privacy and security, making it...
    Downloads: 0 This Week
  • 20
    Raspberry Pi GCC Toolchains

    CI maintained precompiled GCC ARM/ARM64 Toolchains for Raspberry Pi

    This project provides the latest Raspberry Pi hardware-optimized GCC cross-compiler and native (ARM and ARM64) automated build scripts and precompiled standalone toolchain binaries, which save you a great deal of time and help you get started quickly with software development on the Pi.
    Downloads: 166 This Week
  • 21

    Meddly

    Multi-terminal and Edge-valued Decision Diagram LibrarY

    Meddly (Multi-terminal and Edge-valued Decision Diagram LibrarY) is a C++ library that natively supports various types of decision diagrams, including BDDs, MDDs, MTMDDs, EV+MDDs, and EV*MDDs. Advanced features include: compact and customizable node storage, configurable garbage collection, and many built-in operations (with compute table support).
    Downloads: 0 This Week
  • 22
    Astrolog

    Astrology calculation, charting, and analysis

    Astrolog is astrology software featuring many types of computation, display, graphics, comparison, and analysis. It supports multiple environments, such as MS Windows and Unix X Windows, with complete C++ source code available. For more information see the Web site: http://www.astrolog.org/astrolog.htm
    Downloads: 16 This Week
  • 23
    Kompute

    General purpose GPU compute framework built on Vulkan

    General purpose GPU compute framework built on Vulkan to support 1000s of cross-vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous, and optimized for advanced GPU data processing use cases. Backed by the Linux Foundation.
    Downloads: 1 This Week
  • 24
    Social Network Visualizer

    Social Network Analysis and Visualization software

    Visit our new site (http://socnetv.org). Social Network Visualizer (SocNetV) is a social network analysis and visualization application. You can draw a social network (graph/digraph) or load an existing one (GraphML, UCINET, Pajek, etc.), compute cohesion, centrality, community, and structural equivalence metrics, and apply various layout algorithms based on actor centrality or prestige scores (e.g. Eigenvector, Betweenness) or on dynamic models (e.g. Kamada-Kawai spring-embedder).
    Downloads: 7 This Week
  • 25
    gVirtualXRay

    Virtual X-Ray Imaging Library on GPU

    gVirtualXRay is a C++ library to simulate X-ray imaging. It is based on the Beer-Lambert law to compute the absorption of light (i.e. photons) by 3D objects (here polygon meshes). It is implemented on the graphics processing unit (GPU) using the OpenGL Shading Language (GLSL). SimpleGVXR is a smaller library built on top of gVirtualXRay. It provides wrappers for Python, R, Ruby, Tcl, C#, Java, and GNU Octave.
    Downloads: 16 This Week
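
The Beer-Lambert computation mentioned in the gVirtualXRay entry can be illustrated with a short CPU-side sketch. This is not gVirtualXRay's API (the library itself runs on the GPU in GLSL); the function name `transmitted_intensity`, the single-energy (monochromatic) beam, and the attenuation coefficients used below are all assumptions made for illustration.

```python
import math


def transmitted_intensity(incident, segments):
    """Beer-Lambert law for a single monochromatic ray.

    incident: intensity entering the scene.
    segments: iterable of (mu, length) pairs, where mu is a linear
              attenuation coefficient (1/cm) and length is the path
              length through that material (cm).
    """
    # Attenuation accumulates as the sum of mu * length along the ray.
    optical_depth = sum(mu * length for mu, length in segments)
    return incident * math.exp(-optical_depth)


# Hypothetical coefficients and path lengths, chosen only for illustration.
ray = [(0.2, 3.0), (0.5, 1.0)]
print(transmitted_intensity(1.0, ray))
```

In a mesh-based simulator, the per-material path lengths would come from intersecting each ray with the polygon meshes; here they are supplied directly to keep the sketch self-contained.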