compute free download

61 projects for "compute" with 2 filters applied:

Filters: Artificial Intelligence, ChromeOS

1

tt-metal

TT-NN operator library, and TT-Metalium low level kernel programming

...The project is designed for developers who need direct access to the company’s Tensix processor architecture, exposing a programming model that is closer to hardware control than high-level inference frameworks. Instead of following a traditional GPU model centered on massive thread parallelism, the platform is built around a grid of specialized compute nodes called Tensix cores, each with local SRAM, dedicated compute units, and multiple RISC-V control processors. The SDK provides the abstractions and APIs needed to manage data movement, compute kernels, memory coordination, and execution flow across this architecture.

Downloads: 2 This Week

Last Update: 4 days ago
See Project
2

DeepSeekMath-V2

Towards self-verifiable mathematical reasoning

DeepSeekMath-V2 is a large-scale open-source AI model designed specifically for advanced mathematical reasoning, theorem proving, and rigorous proof verification. It’s built by DeepSeek as a successor to their earlier math-specialist models. Unlike general-purpose LLMs that might generate plausible-looking math but sometimes hallucinate or mishandle rigorous logic, Math-V2 is engineered to not only generate solutions but also self-verify them, meaning it examines the derivations, checks...

Downloads: 10 This Week

Last Update: 2025-12-01
See Project
3

clip-retrieval

Easily compute clip embeddings and build a clip retrieval system

clip-retrieval is an open-source toolkit designed to build large-scale semantic search systems for images and text by leveraging CLIP embeddings to enable multimodal retrieval. It allows developers to compute embeddings for both images and text efficiently and then index them for fast similarity search across massive datasets. The system is optimized for performance and scalability, capable of processing tens or even hundreds of millions of embeddings using GPU acceleration. It includes components for inference, indexing, filtering, and serving results through APIs, making it a complete pipeline for building production-ready retrieval systems. ...

Downloads: 1 This Week

Last Update: 2026-03-18
See Project
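
The embed-then-search flow described for clip-retrieval can be illustrated with a minimal sketch. It is not the project's API: the `normalize` and `top_k` helpers and the three-dimensional vectors are hypothetical stand-ins for real CLIP embeddings, and a brute-force scan takes the place of the approximate index a real deployment would use.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def top_k(query, index, k=2):
    """Return the k (item_id, score) pairs most similar to the query."""
    q = normalize(query)
    scored = [
        (item_id, sum(a * b for a, b in zip(q, emb)))
        for item_id, emb in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical 3-dimensional embeddings standing in for real CLIP vectors.
index = {
    "cat.jpg": normalize([0.9, 0.1, 0.0]),
    "dog.jpg": normalize([0.1, 0.9, 0.0]),
    "car.jpg": normalize([0.0, 0.1, 0.9]),
}

results = top_k([1.0, 0.2, 0.0], index, k=2)
print(results)
```

Because the stored vectors are normalized once at indexing time, each query costs only one dot product per item; at the scale the description mentions, that scan would be replaced by an approximate nearest-neighbor index.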
4

MiniRAG

Making RAG Simpler with Small and Open-Sourced Language Models

MiniRAG is a lightweight retrieval-augmented generation tool designed to bring the benefits of RAG workflows to smaller datasets, edge environments, and constrained compute settings by simplifying embedding, indexing, and retrieval. It extracts text from documents, code, or other structured inputs and converts it into embeddings using efficient models, then stores these vectors for fast nearest-neighbor search without requiring huge databases or separate vector servers. When a query is issued, MiniRAG retrieves the most relevant contexts and feeds them into a generative model to produce an answer that is grounded in the source material rather than hallucinated. ...

Downloads: 1 This Week

Last Update: 2026-02-03
See Project
5

MiniMind-V

Train a 26M-parameter visual multimodal VLM from scratch

MiniMind-V is an experimental open-source project that aims to train a very small multimodal vision–language model (VLM) from scratch with extremely low compute and cost, making research and experimentation accessible to more people. The repository showcases training workflows and code designed to produce a 26-million parameter model—including both image and text capabilities—using minimal resources in very little time, reflecting a trend toward democratizing AI research. MiniMind-V combines techniques from modern vision-language modeling but focuses on efficiency and simplicity so that individuals or small teams can explore multimodal learning without massive GPU clusters. ...

Downloads: 0 This Week

Last Update: 2026-01-21
See Project
6

Netflix Maestro

Netflix’s Workflow Orchestrator

...It was designed to support the demanding internal infrastructure of Netflix, where thousands of workflows must process massive volumes of data reliably and efficiently every day. The platform enables engineers and data scientists to define workflows using structured configuration files and execute tasks across diverse compute environments, including scripts, containers, and notebook environments. Maestro provides built-in mechanisms for retry logic, task scheduling, dependency management, and error handling, which are essential when orchestrating production-scale pipelines.

Downloads: 0 This Week

Last Update: 2026-05-01
See Project
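
The dependency management and retry logic mentioned for Maestro can be sketched generically. This is not Maestro's configuration format or API: `run_workflow`, the step names, and the retry policy are hypothetical, and the ordering relies on Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

def run_workflow(steps, deps, max_attempts=3):
    """Run steps in dependency order, retrying a failed step up to max_attempts times."""
    order = list(TopologicalSorter(deps).static_order())
    attempts = {}
    for name in order:
        for attempt in range(1, max_attempts + 1):
            attempts[name] = attempt
            try:
                steps[name]()
                break  # step succeeded, move on to the next one
            except RuntimeError:
                if attempt == max_attempts:
                    raise  # retries exhausted, fail the whole workflow

    return order, attempts

calls = {"count": 0}

def flaky_transform():
    """Hypothetical step that fails once before succeeding."""
    calls["count"] += 1
    if calls["count"] < 2:
        raise RuntimeError("transient failure")

steps = {"extract": lambda: None, "transform": flaky_transform, "load": lambda: None}
# Each key maps a step to the set of steps that must finish before it.
deps = {"transform": {"extract"}, "load": {"transform"}}

order, attempts = run_workflow(steps, deps)
print(order, attempts)
```

A production orchestrator would add backoff between attempts and persist step state; the sketch only shows how ordering and retries interact.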
7

OpenMythos

A theoretical reconstruction of the Claude Mythos architecture

...The architecture incorporates advanced techniques such as mixture-of-experts routing, adaptive computation time, and multiple attention mechanisms to dynamically allocate compute where needed. It is highly configurable through a centralized configuration system, allowing experimentation with different architectural parameters such as loop depth and attention type.

Downloads: 27 This Week

Last Update: 2026-04-27
See Project
8

FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

...The library supports both BF16 and FP16 data types, and includes a paged KV cache implementation with a block size of 64 to efficiently manage memory during decoding. In compute-bound settings, it can reach up to ~660 TFLOPS on H800 SXM5 hardware, while in memory-bound configurations it can push memory throughput to ~3000 GB/s. The team regularly updates it with performance improvements; for example, a 2025 update claims 5% to 15% gains on compute-bound workloads while maintaining API compatibility.

Downloads: 0 This Week

Last Update: 2026-04-29
See Project
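
The paged KV cache with a block size of 64 mentioned for FlashMLA implies a simple address calculation. The sketch below shows only that arithmetic: `blocks_needed`, `locate`, and the block table contents are hypothetical and do not reflect FlashMLA's actual kernel interface.

```python
BLOCK_SIZE = 64  # block size quoted in the description above

def blocks_needed(seq_len, block_size=BLOCK_SIZE):
    """Number of cache blocks required to hold seq_len tokens (ceiling division)."""
    return (seq_len + block_size - 1) // block_size

def locate(token_pos, block_table, block_size=BLOCK_SIZE):
    """Map a logical token position to (physical_block_id, offset_within_block)."""
    logical_block, offset = divmod(token_pos, block_size)
    return block_table[logical_block], offset

# Hypothetical block table: logical blocks 0, 1, 2 of one sequence live in
# physical blocks 7, 2, 9, which need not be contiguous in memory.
block_table = [7, 2, 9]

print(blocks_needed(130))
print(locate(63, block_table))
print(locate(64, block_table))
print(locate(130, block_table))
```

Because a sequence only ever claims whole blocks on demand, memory wasted per sequence is bounded by one partially filled block.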
9

Feynman

The open source AI research agent

Feynman is a command-line AI research agent designed to automate complex research workflows by orchestrating multiple specialized agents that collaborate to gather, analyze, and synthesize information into structured outputs. It operates as a “Claude Code for research,” allowing users to input natural language queries and receive fully developed, source-grounded research briefs, literature reviews, or experimental analyses. The system is built around a multi-agent architecture that includes...

Downloads: 11 This Week

Last Update: 1 day ago
See Project
10

TurboQuant PyTorch

From-scratch PyTorch implementation of Google's TurboQuant

TurboQuant PyTorch is a specialized deep learning optimization framework designed to accelerate neural network inference and training through advanced quantization techniques within the PyTorch ecosystem. The project focuses on reducing the computational and memory footprint of models by converting floating-point representations into lower-precision formats while preserving performance. It provides tools for experimenting with different quantization strategies, enabling developers to balance...

Downloads: 2 This Week

Last Update: 2026-04-23
See Project
11

GLM-4.1V

GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

GLM-4.1V, often described as a smaller, lighter member of the GLM-V family, offers a more resource-efficient option for users who want multimodal capabilities without large compute resources. Though smaller in scale, it maintains competitive performance for its size: on a number of multimodal reasoning and vision-language tasks it outperforms some much larger models from other families. It represents a trade-off: somewhat reduced capacity compared to 4.5V or 4.6V, but with benefits in speed, deployability, and lower hardware requirements, making it especially useful for developers experimenting locally, building lightweight agents, or deploying on limited infrastructure. ...

Downloads: 0 This Week

Last Update: 2026-04-06
See Project
12

MiniMax-M1

Open-weight, large-scale hybrid-attention reasoning model

...It is built on the MiniMax-Text-01 foundation and keeps the same massive parameter budget, but reworks the attention and training setup for better reasoning and test-time compute scaling. Architecturally, it combines Mixture-of-Experts layers with lightning attention, enabling the model to support a native context length of 1 million tokens while using far fewer FLOPs than comparable reasoning models for very long generations. The team emphasizes efficient scaling of test-time compute: at 100K-token generation lengths, M1 reportedly uses only about 25 percent of the FLOPs of some competing models, making extended “think step” traces more feasible. ...

Downloads: 0 This Week

Last Update: 2025-12-01
See Project
13

Multica

The open-source managed agents platform

...It introduces a paradigm where agents can be assigned tasks, participate in discussions, and autonomously execute work while reporting progress and blockers in real time. The system integrates with multiple AI coding tools and provides a unified interface for managing tasks, compute environments, and agent execution pipelines. It includes both a web interface and a CLI that connects local or cloud-based runtimes to the platform, enabling flexible deployment and scaling. Multica emphasizes collaboration between humans and AI by allowing agents to operate alongside developers in shared workspaces. It also supports reusable skill accumulation, meaning that solutions generated by agents can be reused across projects to improve efficiency over time.

Downloads: 5 This Week

Last Update: 21 hours ago
See Project
14

DeepSeek Coder

DeepSeek Coder: Let the Code Write Itself

DeepSeek-Coder is a series of code-specialized language models designed to generate, complete, and infill code (and mixed code + natural language) with high fluency in both English and Chinese. The models are trained from scratch on a massive corpus (~2 trillion tokens), of which about 87% is code and 13% is natural language. This dataset covers project-level code structure (not just line-by-line snippets), using a large context window (e.g. 16K) and a secondary fill-in-the-blank objective...

Downloads: 11 This Week

Last Update: 2025-11-11
See Project
15

GLM-4.5

GLM-4.5: Open-source LLM for intelligent agents by Z.ai

GLM-4.5 is a cutting-edge open-source large language model designed by Z.ai for intelligent agent applications. The flagship GLM-4.5 model has 355 billion total parameters with 32 billion active parameters, while the compact GLM-4.5-Air version offers 106 billion total parameters and 12 billion active parameters. Both models unify reasoning, coding, and intelligent agent capabilities, providing two modes: a thinking mode for complex reasoning and tool usage, and a non-thinking mode for...

1 Review

Downloads: 54 This Week

Last Update: 2026-02-01
See Project
16

Paddler

Open-source LLM load balancer and serving platform for hosting LLMs

...The system acts as a specialized load balancer and serving layer for language models, enabling organizations to run inference workloads without relying on external API providers. It supports running models locally through engines such as llama.cpp while distributing requests across multiple compute nodes to improve performance and reliability. The architecture is designed with privacy and cost control in mind, making it suitable for organizations that handle sensitive data or require predictable operational costs. Paddler also includes tools for monitoring, request buffering, and autoscaling integration so that deployments can adapt dynamically to changing workloads. ...

Downloads: 5 This Week

Last Update: 2026-04-30
See Project
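
The request distribution and buffering described for Paddler can be sketched as a least-loaded dispatcher. The `Balancer` class, node names, and slot counts are hypothetical illustrations and not Paddler's actual interface or scheduling policy.

```python
from collections import deque

class Balancer:
    """Send each request to the node with the most free slots; buffer when all are busy."""

    def __init__(self, slots):
        self.free = dict(slots)  # node name -> number of free inference slots
        self.buffer = deque()    # requests waiting for a slot, oldest first

    def submit(self, request):
        """Return the node chosen for the request, or None if it was buffered."""
        node = max(self.free, key=self.free.get)
        if self.free[node] == 0:
            self.buffer.append(request)
            return None
        self.free[node] -= 1
        return node

    def release(self, node):
        """Free a slot on node, handing it to the oldest buffered request if any."""
        if self.buffer:
            return self.buffer.popleft(), node
        self.free[node] += 1
        return None

balancer = Balancer({"node-a": 1, "node-b": 2})
placements = [balancer.submit(r) for r in ["r1", "r2", "r3", "r4"]]
handoff = balancer.release("node-b")
print(placements, handoff)
```

Buffering instead of rejecting keeps short bursts from failing outright; the queue length is also a natural signal for the autoscaling integration the description mentions.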
17

The Pope Bot

Autonomous AI agent that you can configure and build

...It’s designed so that every action taken by the agent is logged as a git commit, giving users complete visibility into what the agent did, why it did it, and when, which makes actions auditable and reversible. The framework treats the repository itself as the agent’s “brain,” and GitHub Actions serve as the compute layer, enabling tasks to run securely without exposing sensitive API keys to the underlying AI. The system integrates with messaging platforms like Telegram, where users can interact with the bot, trigger actions, or receive notifications, and supports scheduling and automation through patterns of request handling.

Downloads: 5 This Week

Last Update: 2 days ago
See Project
18

Attention Residuals (AttnRes)

Drop-in replacement for standard residual connections in Transformers

Attention Residuals is a research-driven architectural innovation for transformer-based models that replaces traditional residual connections with an attention-based mechanism to improve information flow across layers. In standard transformers, residual connections simply sum outputs from previous layers, which can lead to uncontrolled growth of hidden states and dilution of early-layer information in deep networks. Attention Residuals introduces a learnable softmax attention mechanism that...

Downloads: 1 This Week

Last Update: 2026-03-18
See Project
19

VoxelMorph

Unsupervised Learning for Image Registration

...VoxelMorph approaches the problem using neural networks that learn to predict deformation fields that transform one image so that it aligns with another. Once the model has been trained, it can rapidly compute the transformation required to register new image pairs, significantly reducing computational time compared to classical registration algorithms. The framework supports both supervised and unsupervised learning approaches and is commonly used in medical imaging applications such as MRI alignment, anatomical analysis, and longitudinal studies.

Downloads: 2 This Week

Last Update: 2026-03-15
See Project
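
The idea of a deformation field that transforms one image to align with another, as described for VoxelMorph, can be shown on a one-dimensional toy signal. This is a simplified illustration, not VoxelMorph's code: `warp_1d` is a hypothetical helper, the displacement values are hand-picked rather than predicted by a network, and real registration operates on 2D or 3D volumes.

```python
def warp_1d(moving, displacement):
    """Resample `moving` at position i + displacement[i] using linear interpolation."""
    last = len(moving) - 1
    warped = []
    for i, d in enumerate(displacement):
        x = min(max(i + d, 0.0), float(last))  # clamp the sample point to the signal
        lo = int(x)
        hi = min(lo + 1, last)
        frac = x - lo
        warped.append((1 - frac) * moving[lo] + frac * moving[hi])
    return warped

moving = [0.0, 10.0, 20.0, 30.0]
# Hand-picked displacement field; in VoxelMorph a trained network would predict it.
displacement = [0.5, 0.5, -1.0, 0.0]

warped = warp_1d(moving, displacement)
print(warped)
```

Once a field is available, applying it is a single resampling pass, which is why a trained model can register new pairs far faster than an iterative optimizer.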
20

ANE Training

Training neural networks on Apple Neural Engine via private APIs

...The repository implements a from-scratch transformer training pipeline capable of running both forward and backward passes on ANE hardware without relying on CoreML, Metal, or GPU acceleration. It explores the internal software stack of the Apple Neural Engine by interfacing with private classes such as _ANEClient and compiling custom compute graphs in the MIL format. The project includes performance benchmarks and kernel breakdowns that show how different components of the training loop are distributed between the ANE and CPU. It is primarily intended as a research and educational proof of concept rather than a production library, highlighting what is technically possible with undocumented hardware access.

Downloads: 2 This Week

Last Update: 2026-03-10
See Project
21

bitnet.cpp

Official inference framework for 1-bit LLMs

...At its core is bitnet.cpp, a highly optimized C++ backend that supports fast, low-memory inference on both CPUs and GPUs, enabling models such as BitNet b1.58 to run without requiring enormous compute infrastructure. The project’s focus on extreme quantization dramatically reduces memory footprint and energy consumption compared with traditional 16-bit or 32-bit LLMs, making it practical to deploy advanced language understanding and generation models on everyday machines. BitNet is built to scale across architectures, with configurable kernels and tiling strategies that adapt to different hardware, and it supports large models with impressive throughput even on modest resources.

Downloads: 3 This Week

Last Update: 2026-03-10
See Project
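
The extreme quantization behind BitNet b1.58 maps each weight to one of three values. The sketch below follows the absmean scheme from the BitNet b1.58 paper in plain Python; `ternary_quantize` and the sample weights are hypothetical, and the code says nothing about how bitnet.cpp's optimized kernels are implemented.

```python
def ternary_quantize(weights, eps=1e-8):
    """Quantize weights to {-1, 0, 1} using the mean absolute value as the scale."""
    scale = sum(abs(w) for w in weights) / len(weights)
    quantized = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Approximate the original weights from ternary codes and the shared scale."""
    return [q * scale for q in quantized]

weights = [0.8, -0.05, 0.3, -1.2, 0.0]  # hypothetical weight row
quantized, scale = ternary_quantize(weights)

print(quantized, scale)
print(dequantize(quantized, scale))
```

With only three possible values per weight, matrix multiplication reduces to additions and subtractions, which is the source of the memory and energy savings the description refers to.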
22

PyTorch3D

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

...It’s designed to make it easy to build and train neural networks that work directly with 3D data such as meshes, point clouds, and implicit surfaces. The library provides fast GPU-accelerated implementations of rendering pipelines, transformations, rasterization, and lighting—making it possible to compute gradients through full 3D rendering processes. Researchers use it for tasks like shape generation, reconstruction, view synthesis, and visual reasoning. PyTorch3D also includes utilities for loading, transforming, and sampling 3D assets, so models can be trained end-to-end from 2D supervision or partial data. Its modular design allows easy extension—components like differentiable rasterizers, mesh blending, or signed distance field (SDF) modules can be swapped or combined to test new architectures quickly.

Downloads: 3 This Week

Last Update: 2025-11-27
See Project
23

StableSwarmUI

Multi-user UI tool for managing and running Stable Diffusion workflows

...It abstracts much of the complexity involved in running diffusion models by offering a structured environment for handling prompts, outputs, and processing queues. StableSwarmUI is built to work alongside backend systems that execute the actual image generation, allowing separation between user interaction and compute workloads. It also emphasizes scalability, making it useful for setups where multiple jobs need to be processed efficiently. Overall, it serves as a coordination layer for Stable Diffusion usage rather than a standalone model implementation.

Downloads: 2 This Week

Last Update: 2026-03-18
See Project
24

ds4.c

DeepSeek 4 Flash local inference engine for Metal

...Built as a native low-level implementation, it focuses on performance, reduced abstraction overhead, and direct integration with Apple GPU acceleration through Metal compute graphs. The project also supports streaming inference behavior and local API serving for integration with external tools and AI applications. Overall, ds4 represents a minimalist high-performance approach to running large language models locally without relying on heavyweight inference frameworks.

Downloads: 0 This Week

Last Update: 12 hours ago
See Project
25

COCOON

Confidential Compute Open Network, Decentralized AI Inference on TON

COCOON is a privacy-aware desktop client framework designed by the developers of Telegram to provide a modern, secure, and extensible environment for building messaging and communication applications. At its core, it combines native desktop performance with web-like flexibility, packing a renderer, UI components, and plugin architecture that allows developers to craft rich experiences similar to those found in native apps. Cocoon’s architecture prioritizes privacy and security, making it...

Downloads: 0 This Week

Last Update: 2026-04-10
See Project