compute free download

Showing 14 open source projects for "compute"

Filters: Artificial Intelligence, C++

1

Compute Library

The Compute Library is a set of computer vision and machine learning functions optimized for both Arm CPUs and GPUs using SIMD technologies. The library provides superior performance to other open-source alternatives and immediate support for new Arm® technologies such as SVE2.

Downloads: 4 This Week

Last Update: 2026-04-15
2

tt-metal

TT-NN operator library, and TT-Metalium low level kernel programming

...The project is designed for developers who need direct access to the company’s Tensix processor architecture, exposing a programming model that is closer to hardware control than high-level inference frameworks. Instead of following a traditional GPU model centered on massive thread parallelism, the platform is built around a grid of specialized compute nodes called Tensix cores, each with local SRAM, dedicated compute units, and multiple RISC-V control processors. The SDK provides the abstractions and APIs needed to manage data movement, compute kernels, memory coordination, and execution flow across this architecture.

Downloads: 13 This Week

Last Update: 5 days ago
3

Step 3.5 Flash

Fast, Sharp & Reliable Agentic Intelligence

...Unlike dense models that activate all their parameters for every token, Step 3.5 Flash uses a sparse Mixture-of-Experts (MoE) architecture that selectively engages only about 11 billion of its roughly 196 billion total parameters per token, delivering high-quality reasoning and interaction at far lower compute cost and latency than traditional large models. Its design targets deep reasoning, long-context handling, coding, and real-time responsiveness, making it suitable for building autonomous agents, advanced assistants, and long-chain cognitive workflows without sacrificing performance.
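The sparse-activation figures quoted above can be checked with simple arithmetic. The sketch below uses only the two approximate parameter counts from the description; the "2 FLOPs per active parameter per token" figure is a common rule-of-thumb assumption, not something stated by the project.

```python
# Approximate parameter counts quoted in the description above.
TOTAL_PARAMS = 196e9   # all experts combined
ACTIVE_PARAMS = 11e9   # parameters engaged for a single token


def active_fraction(active: float, total: float) -> float:
    """Fraction of the model's parameters used for one token."""
    return active / total


def approx_flops_per_token(active_params: float) -> float:
    """Rough forward-pass cost; assumes ~2 FLOPs per active parameter."""
    return 2.0 * active_params


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    dense_cost = approx_flops_per_token(TOTAL_PARAMS)
    sparse_cost = approx_flops_per_token(ACTIVE_PARAMS)
    print(f"active fraction per token: {frac:.1%}")
    print(f"dense / sparse cost ratio: {dense_cost / sparse_cost:.1f}x")
```

Under these assumptions, only a few percent of the weights take part in each token, which is where the lower compute cost comes from; memory for the full parameter set is still required.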

Downloads: 5 This Week

Last Update: 2026-04-03
4

PyTorch/XLA

Enabling PyTorch on Google TPU

...Cloud TPU VM is generally available and provides direct access to the TPU host. The recommended setup for running distributed training on TPU Pods pairs Compute VM Instance Groups with TPU Pods. Each Compute VM in the instance group drives 8 cores on the TPU Pod.

Downloads: 0 This Week

Last Update: 2025-11-17
5

FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

...The library supports both BF16 and FP16 data types, and includes a paged KV cache implementation with a block size of 64 to manage memory efficiently during decoding. In heavily compute-bound settings, it can reach up to ~660 TFLOPS on H800 SXM5 hardware, while in memory-bound configurations it can push memory throughput to ~3000 GB/s. The team regularly updates it with performance improvements; for example, a 2025 update claims 5% to 15% gains on compute-bound workloads while maintaining API compatibility.
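To illustrate what a paged KV cache with a block size of 64 implies, here is a minimal sketch of the bookkeeping only. It is not FlashMLA's API: the function names and the `block_table` layout are hypothetical, chosen to show how a token position maps to a physical block and an offset within it.

```python
BLOCK_SIZE = 64  # block size quoted in the description above


def blocks_needed(seq_len: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of fixed-size blocks required to hold seq_len cached tokens."""
    return (seq_len + block_size - 1) // block_size


def locate(token_pos: int, block_table: list[int],
           block_size: int = BLOCK_SIZE) -> tuple[int, int]:
    """Map a token position to (physical block id, offset inside the block).

    block_table[i] is the physical block backing the i-th logical block of
    the sequence (a hypothetical layout used for illustration).
    """
    logical_block, offset = divmod(token_pos, block_size)
    return block_table[logical_block], offset


if __name__ == "__main__":
    table = [7, 2, 9]  # three physical blocks, not contiguous in memory
    print(blocks_needed(130))  # tokens 0..129 span three blocks
    print(locate(130, table))  # position 130 falls in the third block
```

Because blocks are allocated on demand and need not be contiguous, a sequence wastes at most one partially filled block of cache memory.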

Downloads: 0 This Week

Last Update: 2026-04-29
6

Cactus

Low-latency AI inference engine optimized for mobile devices

...It supports a wide range of AI tasks including text generation, speech-to-text, vision processing, and retrieval-augmented workflows through a unified API interface. A notable feature of Cactus is its hybrid execution model, which can dynamically route tasks between on-device processing and cloud services when additional compute is required.

Downloads: 2 This Week

Last Update: 2026-04-18
7

bitnet.cpp

Official inference framework for 1-bit LLMs

...At its core is bitnet.cpp, a highly optimized C++ backend that supports fast, low-memory inference on both CPUs and GPUs, enabling models such as BitNet b1.58 to run without requiring enormous compute infrastructure. The project’s focus on extreme quantization dramatically reduces memory footprint and energy consumption compared with traditional 16-bit or 32-bit LLMs, making it practical to deploy advanced language understanding and generation models on everyday machines. BitNet is built to scale across architectures, with configurable kernels and tiling strategies that adapt to different hardware, and it supports large models with impressive throughput even on modest resources.

Downloads: 2 This Week

Last Update: 2026-03-10
8

oneDNN

oneAPI Deep Neural Network Library (oneDNN)

This software was previously known as Intel(R) Math Kernel Library for Deep Neural Networks (Intel(R) MKL-DNN) and Deep Neural Network Library (DNNL). oneAPI Deep Neural Network Library (oneDNN) is an open-source cross-platform performance library of basic building blocks for deep learning applications. oneDNN is part of oneAPI. The library is optimized for Intel(R) Architecture Processors, Intel Processor Graphics and Xe Architecture graphics. oneDNN has experimental support for the...

Downloads: 3 This Week

Last Update: 1 day ago
9

COCOON

Confidential Compute Open Network, Decentralized AI Inference on TON

COCOON is a privacy-aware desktop client framework designed by the developers of Telegram to provide a modern, secure, and extensible environment for building messaging and communication applications. At its core, it combines native desktop performance with web-like flexibility, packing a renderer, UI components, and plugin architecture that allows developers to craft rich experiences similar to those found in native apps. Cocoon’s architecture prioritizes privacy and security, making it...

Downloads: 0 This Week

Last Update: 2026-04-10
10

Procgen

Procedurally-Generated Game-Like Gym-Environments

Procgen (short for Procedural Generation Benchmark) is a suite of 16 procedurally generated, game-like reinforcement learning environments designed to evaluate generalization and sample efficiency in RL agents. Unlike fixed, deterministic environments, Procgen generates new levels (layouts, obstacles, visual variation) each episode, making it impossible for an agent to simply memorize trajectories. The environments are designed to run very quickly (thousands of steps per second on a single...

Downloads: 3 This Week

Last Update: 2025-10-03
11

MACE

Deep learning inference framework optimized for mobile platforms

Mobile AI Compute Engine (MACE for short) is a deep learning inference framework optimized for mobile heterogeneous computing on Android, iOS, Linux, and Windows devices. The runtime is optimized with NEON, OpenCL, and Hexagon, and the Winograd algorithm is used to speed up convolution operations. Initialization is also optimized to be faster.
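As background on why the Winograd algorithm speeds up convolution, the sketch below shows the textbook 1-D case F(2,3): two outputs of a 3-tap filter computed with 4 multiplications instead of 6. It is an illustration of the general technique only, not MACE's implementation, which targets 2-D convolutions on mobile hardware.

```python
def direct_f23(d, g):
    """Reference: two outputs of a 3-tap filter, using 6 multiplications."""
    y0 = d[0] * g[0] + d[1] * g[1] + d[2] * g[2]
    y1 = d[1] * g[0] + d[2] * g[1] + d[3] * g[2]
    return y0, y1


def winograd_f23(d, g):
    """Winograd F(2,3): the same two outputs with 4 multiplications.

    d holds 4 input samples and g holds 3 filter taps. The filter-side
    terms (sums of taps divided by 2) can be precomputed once per filter.
    """
    g_sum = (g[0] + g[1] + g[2]) / 2
    g_alt = (g[0] - g[1] + g[2]) / 2
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * g_sum
    m3 = (d[2] - d[1]) * g_alt
    m4 = (d[1] - d[3]) * g[2]
    return m1 + m2 + m3, m2 - m3 - m4


if __name__ == "__main__":
    data, taps = [1.0, 2.0, 3.0, 4.0], [1.0, 0.0, -1.0]
    print(direct_f23(data, taps))
    print(winograd_f23(data, taps))
```

Both functions should agree up to floating-point rounding; the saving comes from trading multiplications for cheaper additions.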

Downloads: 0 This Week

Last Update: 2022-01-13
12

uTensor

TinyML AI inference library

uTensor is an embedded machine learning inference framework designed to run neural network models on resource-constrained devices such as microcontrollers and Internet-of-Things hardware. The project focuses on enabling TinyML deployments by translating trained machine learning models into efficient C++ code that can execute directly on embedded systems. Instead of training models on-device, the framework uses an offline workflow that converts trained TensorFlow graphs into optimized...

Downloads: 0 This Week

Last Update: 2026-03-15
13

Gaussian Mixture Model and Regression

GMM-GMR is a lightweight package of C/C++ functions for computing Gaussian Mixture Models (GMM) and Gaussian Mixture Regression (GMR). It can encode any dataset as a GMM, and GMR can then be used to retrieve partial data by specifying the desired inputs.
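The retrieval step described above can be sketched for the simplest case of one input and one output dimension. This is a generic illustration of Gaussian Mixture Regression in Python, not the package's C/C++ interface; the component dictionary keys are hypothetical, and the mixture parameters are assumed to have been fitted already.

```python
import math


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def gmr_predict(x: float, components: list[dict]) -> float:
    """Expected output y given input x under a fitted joint GMM over (x, y).

    Each component carries: prior, mean_x, mean_y, var_x, cov_xy.
    """
    # Responsibility of each component for the query input.
    weights = [c["prior"] * gaussian_pdf(x, c["mean_x"], c["var_x"])
               for c in components]
    total = sum(weights)
    y = 0.0
    for w, c in zip(weights, components):
        # Conditional mean of y given x for this component.
        cond_mean = c["mean_y"] + c["cov_xy"] / c["var_x"] * (x - c["mean_x"])
        y += (w / total) * cond_mean
    return y


if __name__ == "__main__":
    model = [
        {"prior": 0.5, "mean_x": -1.0, "mean_y": 0.0, "var_x": 1.0, "cov_xy": 0.0},
        {"prior": 0.5, "mean_x": 1.0, "mean_y": 10.0, "var_x": 1.0, "cov_xy": 0.0},
    ]
    print(gmr_predict(0.0, model))
```

The prediction blends each component's linear estimate by how well that component explains the query input, which is what lets a single model return smooth outputs across the input range.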

Downloads: 0 This Week

Last Update: 2015-06-25
14

Distributed breve

"Distributed breve" (distbreve) is an open-source software package that makes it easy to distribute asynchronous, parallelizable steve code running under the breve simulation environment across as many compute nodes as possible.

1 Review

Downloads: 0 This Week

Last Update: 2013-03-11