Showing 50 open source projects for "hardware"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • 1
    FlashMLA

    FlashMLA

    FlashMLA: Efficient Multi-head Latent Attention Kernels

    ...The library supports both BF16 and FP16 data types, and includes a paged KV cache implementation with a block size of 64 to efficiently manage memory during decoding. On very compute-bound settings, it can reach up to ~660 TFLOPS on H800 SXM5 hardware, while in memory-bound configurations it can push memory throughput to ~3000 GB/s. The team regularly updates it with performance improvements; for example, a 2025 update claims 5 % to 15 % gains on compute-bound workloads while maintaining API compatibility.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Step1X-Edit

    Step1X-Edit

    A SOTA open-source image editing model

    Step1X-Edit is a state-of-the-art open-source image editing model/framework that uses a multimodal large language model (LLM) together with a diffusion-based image decoder to let users edit images simply via natural-language instructions plus a reference image. You supply an existing image and a textual command — e.g. “add a ruby pendant on the girl’s neck” or “make the background a sunset over mountains” — and the model interprets the instruction, computes a latent embedding combining the...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    ChatGLM Efficient Tuning

    ChatGLM Efficient Tuning

    Fine-tuning ChatGLM-6B with PEFT

    ChatGLM-Efficient-Tuning is a hands-on toolkit for fine-tuning ChatGLM-family models with parameter-efficient methods on everyday hardware. It wraps techniques like LoRA and prompt-tuning into simple training scripts so you can adapt a large model to your domain without full retraining. The project exposes practical switches for quantization and mixed precision, allowing bigger models to fit into limited VRAM. It includes examples for instruction tuning and dialogue datasets, making it straightforward to stand up a task-specific assistant. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    GLM-130B

    GLM-130B

    GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)

    ...Trained on over 400 billion tokens (200B English, 200B Chinese), it achieves performance surpassing GPT-3 175B, OPT-175B, and BLOOM-176B on multiple benchmarks, while also showing significant improvements on Chinese datasets compared to other large models. The model supports efficient inference via INT8 and INT4 quantization, reducing hardware requirements from 8× A100 GPUs to as little as a single server with 4× RTX 3090s. Built on the SwissArmyTransformer (SAT) framework and compatible with DeepSpeed and FasterTransformer, it supports high-speed inference (up to 2.5× faster) and reproducible evaluation across 30+ benchmark tasks.
    Downloads: 1 This Week
    Last Update:
    See Project
  • $300 Free Credits to Build on Google Cloud Icon
    $300 Free Credits to Build on Google Cloud

    New to Google Cloud? Get $300 in credits to explore Compute Engine, BigQuery, Cloud Run, Gemini Enterprise Agent Platform, and more.

    Start your next project with $300 in free Google Cloud credit. Spin up VMs, run containers, query petabytes in BigQuery, or build agents with Gemini Enterprise Agent Platform. Once your credits are used, keep building with 20+ always-free tier products including Compute Engine, Cloud Storage, GKE, and Cloud Run functions. No commitment required—just sign up and start building.
    Claim $300 Free
  • 5
    ConvNeXt V2

    ConvNeXt V2

    Code release for ConvNeXt V2 model

    ...A key innovation is a new Global Response Normalization (GRN) layer added to the ConvNeXt backbone, which enhances feature competition across channels. The result is a convnet that competes strongly with transformer architectures on recognition benchmarks while being efficient and hardware-friendly. The repository provides official PyTorch implementations for multiple model sizes (Atto, Femto, Pico, up through Huge), conversion from JAX weights, code for pretraining/fine-tuning, and pretrained checkpoints. It supports both self-supervised pretraining and supervised fine-tuning.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Stable Diffusion

    Stable Diffusion

    A latent text-to-image diffusion model

    ...It was trained on large-scale image datasets and later fine-tuned to produce 512×512 images with strong visual fidelity. Because the system runs efficiently on consumer hardware compared to earlier generative models, it helped popularize local AI image generation workflows. The repository includes reference scripts and model configurations that allow researchers and developers to reproduce, modify, or extend the architecture. Overall, stable-diffusion has become a foundational tool in the generative AI ecosystem for art creation, research, and multimodal experimentation.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 7
    Laguna XS.2

    Laguna XS.2

    Open agentic coding model optimized for local deployment

    ...The model contains 33B total parameters with only 3B activated per token, allowing it to deliver strong coding performance while remaining efficient enough to run locally on modern consumer hardware. It uses a hybrid attention architecture that combines Sliding Window Attention and global attention layers, reducing memory requirements and improving inference speed. Laguna XS.2 supports native reasoning with interleaved thinking between tool calls, enabling more capable autonomous coding agents and multi-step workflows. The model features a 262K-token context window, preserved reasoning across interactions, FP8 KV-cache optimization, and compatibility with local deployment ecosystems such as Ollama and vLLM.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Command A+

    Command A+

    4-bit Command A+ model for enterprise agents and multilingual tasks

    ...The W4A4 release applies 4-bit weight and activation quantization mainly to MoE experts, preserving attention components at full precision to reduce quality loss while improving speed, latency, and hardware efficiency. Cohere recommends W4A4 for most users because it offers a smaller hardware footprint with negligible benchmark differences compared to BF16 and FP8 versions. The model supports a 128K input context and 64K output length, covers 48 languages, and includes conversational tool-use capabilities with JSON-schema tools and optional citation grounding.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Mistral Large 3 675B Base 2512

    Mistral Large 3 675B Base 2512

    Frontier-scale 675B multimodal base model for custom AI training

    ...Its architecture includes a powerful language MoE and a 2.5B-parameter vision encoder, enabling multimodal understanding out of the box. Mistral Large 3 Base supports deployment on-premises using FP8 or NVFP4 formats, enabling high-performance workflows on B200, H200, H100, or A100 hardware.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Fully Managed MySQL, PostgreSQL, and SQL Server Icon
    Fully Managed MySQL, PostgreSQL, and SQL Server

    Automatic backups, patching, replication, and failover. Focus on your app, not your database.

    Cloud SQL handles your database ops end to end, so you can focus on your app.
    Try Free
  • 10
    SuperGemma4

    SuperGemma4

    Fast uncensored Gemma model optimized for local chat and coding

    ...Unlike raw base models, it inherits improvements from the SuperGemma Fast line, resulting in better performance in logic, coding, and real-world text workflows. The model is packaged in GGUF format for efficient use with llama.cpp and has been specifically tested on Apple Silicon hardware, delivering high token speeds and smooth local inference. A neutral chat template is embedded to prevent prompt misrouting issues, ensuring consistent responses without unintended shifts into coding or tool-use modes.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    translategemma-4b-it

    translategemma-4b-it

    Lightweight multimodal translation model for 55 languages

    ...With a compact ~5B parameter footprint and BF16 support, the model is designed to run efficiently on laptops, desktops, and private cloud infrastructure, making advanced translation accessible without heavy hardware requirements. TranslateGemma uses a structured chat template that enforces explicit source and target language codes, ensuring consistent, deterministic behavior and reducing ambiguity in multilingual pipelines. It integrates seamlessly with Hugging Face Transformers through pipelines or direct model initialization, supporting GPU acceleration and scalable deployment.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    gpt-oss-20b

    gpt-oss-20b

    OpenAI’s compact 20B open model for fast, agentic, and local use

    ...GPT-OSS-20B is compatible with Transformers, vLLM, Ollama, PyTorch, and other tools. It is ideal for developers building lightweight AI agents or experimenting with fine-tuning on consumer-grade hardware.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    gpt-oss-120b

    gpt-oss-120b

    OpenAI’s open-weight 120B model optimized for reasoning and tooling

    GPT-OSS-120B is a powerful open-weight language model by OpenAI, optimized for high-level reasoning, tool use, and agentic tasks. With 117B total parameters and 5.1B active parameters, it’s designed to fit on a single H100 GPU using native MXFP4 quantization. The model supports fine-tuning, chain-of-thought reasoning, and structured outputs, making it ideal for complex workflows. It operates in OpenAI’s Harmony response format and can be deployed via Transformers, vLLM, Ollama, LM Studio,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Rampart

    Rampart

    Lightweight on-device model for private AI text redaction

    Rampart is a lightweight, on-device privacy protection model developed by the National Design Studio to detect and redact personally identifiable information (PII) before text leaves a user's device. Rather than relying on server-side filtering, Rampart performs token-level PII detection locally, enabling privacy-preserving AI interactions with minimal latency and without exposing sensitive information to external services. The released model is a 14.7 MB ONNX artifact based on a fine-tuned...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    ZAYA1-8B

    ZAYA1-8B

    Efficient MoE reasoning model for coding and math workloads

    ZAYA1-8B is a compact Mixture-of-Experts reasoning model developed by Zyphra, designed to deliver unusually high intelligence density with fewer than 1 billion active parameters. The model contains 8.4B total parameters with around 760M active during inference, allowing it to achieve strong reasoning, mathematics, and coding performance while remaining lightweight enough for efficient local or on-device deployment. ZAYA1-8B is optimized for long-form reasoning and test-time compute...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Ministral 3 8B Instruct 2512

    Ministral 3 8B Instruct 2512

    Compact 8B multimodal instruct model optimized for edge deployment

    ...This FP8 instruct-fine-tuned variant is optimized for chat, instruction following, and structured outputs, making it ideal for daily assistant tasks and lightweight agentic workflows. Designed for edge deployment, the model can run on a wide range of hardware and fits locally on a single 12GB GPU, with the option for even smaller quantized configurations. Its multilingual support covers dozens of major languages, allowing it to work across diverse global environments and applications. The model adheres reliably to system prompts, supports native function calling, and outputs clean JSON, giving it strong tool-use behavior.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    GigaChat 3 Ultra

    GigaChat 3 Ultra

    High-performance MoE model with MLA, MTP, and multilingual reasoning

    ...This combination significantly boosts reasoning, coding, and multilingual performance across modern benchmarks. Designed for high-performance deployment, GigaChat 3 Ultra supports major inference engines and offers optimized BF16 and FP8 execution paths for cluster-grade hardware.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Mistral Large 3 675B Instruct 2512 NVFP4

    Mistral Large 3 675B Instruct 2512 NVFP4

    Quantized 675B multimodal instruct model optimized for NVFP4

    Mistral Large 3 675B Instruct 2512 NVFP4 is a frontier-scale multimodal Mixture-of-Experts model featuring 675B total parameters and 41B active parameters, trained from scratch on 3,000 H200 GPUs. This NVFP4 checkpoint is a post-training-activation quantized version of the original instruct model, created through a collaboration between Mistral AI, vLLM, and Red Hat using llm-compressor. It retains the same instruction-tuned behavior as the FP8 model, making it ideal for production...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Ministral 3 3B Base 2512

    Ministral 3 3B Base 2512

    Small 3B-base multimodal model ideal for custom AI on edge hardware

    Ministral 3 3B Base 2512 is the smallest model in the Ministral 3 family, offering a compact yet capable multimodal architecture suited for lightweight AI applications. It combines a 3.4B-parameter language model with a 0.4B vision encoder, enabling both text and image understanding in a tiny footprint. As the base pretrained model, it is not fine-tuned for instructions or reasoning, making it the ideal foundation for custom post-training, domain adaptation, or specialized downstream tasks....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Ministral 3 14B Reasoning 2512

    Ministral 3 14B Reasoning 2512

    High-precision 14B multimodal model built for advanced reasoning tasks

    Ministral 3 14B Reasoning 2512 is the largest model in the Ministral 3 series, delivering frontier-level performance with capabilities comparable to the Mistral Small 3.2 24B model. It pairs a 13.5B-parameter language model with a 0.4B vision encoder, enabling strong multimodal reasoning across both text and images. This version is specifically post-trained for reasoning tasks, making it highly effective for math, coding, STEM workloads, and complex multi-step problem-solving. Despite its...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Ministral 3 14B Instruct 2512

    Ministral 3 14B Instruct 2512

    Efficient 14B multimodal instruct model with edge deployment and FP8

    Ministral 3 14B Instruct 2512 is the largest model in the Ministral 3 family, delivering frontier performance comparable to much larger systems while remaining optimized for edge-level deployment. It combines a 13.5B-parameter language model with a 0.4B-parameter vision encoder, enabling strong multimodal understanding in both text and image tasks. This FP8 instruct-tuned variant is designed specifically for chat, instruction following, and agentic workflows with robust system-prompt...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Mistral Large 3 675B Instruct 2512

    Mistral Large 3 675B Instruct 2512

    Frontier-scale 675B multimodal instruct MoE model for enterprise AIMis

    ...The model supports dozens of languages and maintains strong system-prompt adherence, making it suitable for global and structured enterprise use. Designed for high performance, it runs on a single node of B200 or H200 GPUs in FP8, and can also operate in NVFP4 mode on H100 or A100 hardware. With a 256k context window, it excels at long-document comprehension, deep retrieval workflows, and complex knowledge-intensive tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Ministral 3 3B Reasoning 2512

    Ministral 3 3B Reasoning 2512

    Compact 3B-param multimodal model for efficient on-device reasoning

    Ministral 3 3B Reasoning 2512 is the smallest reasoning-capable model in the Ministal-3 family, yet delivers a surprisingly capable multimodal and multilingual base for lightweight AI applications. It pairs a 3.4B-parameter language model with a 0.4B-parameter vision encoder, enabling it to understand both text and image inputs. This reasoning-tuned variant is optimized for tasks like math, coding, and other STEM-related problem solving, making it suitable for applications that require...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Ministral 3 8B Base 2512

    Ministral 3 8B Base 2512

    Versatile 8B-base multimodal LLM, flexible foundation for custom AI

    ...The model supports a large 256k token context window, making it capable of handling long documents or extended dialogues. Because it comes from the edge-optimized Ministral 3 family, it remains deployable on reasonably powerful hardware while offering a good balance between capability and resource use. Its multilingual and multimodal pretraining enables broad applicability across languages and tasks — from generation to classification to vision-language tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Ministral 3 14B Base 2512

    Ministral 3 14B Base 2512

    Powerful 14B-base multimodal model — flexible base for fine-tuning

    Ministral 3 14B Base 2512 is the largest model in the Ministral 3 line, offering state-of-the-art language and vision capabilities in a dense, base-pretrained form. It combines a 13.5B-parameter language model with a 0.4B-parameter vision encoder, enabling both high-quality text understanding/generation and image-aware tasks. As a “base” model (i.e. not fine-tuned for instruction or reasoning), it provides a flexible foundation ideal for custom fine-tuning or downstream specialization. The...
    Downloads: 0 This Week
    Last Update:
    See Project
Auth0 Logo