Showing 22 open source projects for "requirements"

  • 1
    ACE-Step 1.5

    The most powerful local music generation model

    ...It integrates cutting-edge generative techniques—such as diffusion-based synthesis combined with compressed autoencoders and lightweight transformer elements—to produce high-quality full-length music tracks with rapid inference times, capable of generating a complete song in seconds on modern GPUs while remaining efficient enough to run on consumer-grade hardware with minimal memory requirements. Beyond straightforward text-to-music synthesis, ACE-Step 1.5 enables flexible creative workflows, including tasks like cover generation, editing existing tracks, transforming vocals to background accompaniment, and stylistic personalization using low-rank adaptation from just a few example songs.
    Downloads: 75 This Week
  • 2
    llama.cpp

    Port of Facebook's LLaMA model in C/C++

    The llama.cpp project enables the inference of Meta's LLaMA model (and other models) in pure C/C++ without requiring a Python runtime. It is designed for efficient and fast model execution, offering easy integration for applications needing LLM-based capabilities. The repository focuses on providing a highly optimized and portable implementation for running large language models directly within C/C++ environments.
    Downloads: 177 This Week
  • 3
    ChatGLM-6B

    ChatGLM-6B: An Open Bilingual Dialogue Language Model

    ...It is optimized for dialogue and question answering with a balance between performance and deployability in consumer hardware settings. It supports quantized inference (INT4, INT8) to reduce GPU memory requirements, with automatic mode switching between precision/memory tradeoffs (full/quantized).
    Downloads: 7 This Week
  • 4
    rwkv.cpp

    INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model

    Besides the usual FP32, it supports FP16, quantized INT4, INT5 and INT8 inference. This project is focused on CPU, but cuBLAS is also supported. RWKV is a novel large language model architecture, with the largest model in the family having 14B parameters. In contrast to Transformer with O(n^2) attention, RWKV requires only state from the previous step to calculate logits. This makes RWKV very CPU-friendly on large context lengths.
    Downloads: 0 This Week
  • 5
    GLM-4.1V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    ...Though smaller in scale, GLM-4.1V maintains competitive performance, particularly impressive on many benchmarks for models of its size: in fact, on a number of multimodal reasoning and vision-language tasks it outperforms some much larger models from other families. It represents a trade-off: somewhat reduced capacity compared to 4.5V or 4.6V, but with benefits in terms of speed, deployability, and lower hardware requirements — making it especially useful for developers experimenting locally, building lightweight agents, or deploying on limited infrastructure. Given its open-source availability under the same project repository, it provides an accessible entry point for testing multimodal reasoning and building proof-of-concept applications.
    Downloads: 0 This Week
  • 6
    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice cloning — meaning it can mimic a target speaker’s voice from a short reference sample — making it versatile for multi-voice uses. ...
    Downloads: 10 This Week
  • 7
    HRM-Text

    1B text generation model based on the HRM architecture

    HRM-Text is a one-billion-parameter text generation model and pretraining framework based on the Hierarchical Reasoning Model architecture. It is designed to make foundation model pretraining more accessible by reducing compute and data requirements compared with traditional scaling-heavy approaches. The system combines hierarchical recurrent design, task-completion strengthening, and latent-space reasoning. Its training stack includes PrefixLM sequence packing, FlashAttention 3 kernels, PyTorch FSDP2, evaluation scripts, and checkpoint conversion tools. The repository supports reference pretraining runs for smaller and larger configurations, with Hopper-class GPUs expected for the attention path. ...
    Downloads: 1 This Week
  • 8
    GLM-4

    GLM-4 series: Open Multilingual Multimodal Chat LMs

    GLM-4 is a family of open models from ZhipuAI that spans base, chat, and reasoning variants at both 32B and 9B scales, with long-context support and practical local-deployment options. The GLM-4-32B-0414 models are trained on ~15T high-quality data (including substantial synthetic reasoning data), then post-trained with preference alignment, rejection sampling, and reinforcement learning to improve instruction following, coding, function calling, and agent-style behaviors. The...
    Downloads: 10 This Week
  • 9
    Stable Diffusion Version 2

    High-Resolution Image Synthesis with Latent Diffusion Models

    ...The repository provides code for training and running Stable Diffusion-style models, instructions for installing dependencies (with notes about performance libraries like xformers), and guidance on hardware/driver requirements for efficient GPU inference and training. It’s organized as a practical, developer-focused toolkit: model code, scripts for inference, and examples for using memory-efficient attention and related optimizations are included so researchers and engineers can run or adapt the model for their own projects. The project sits within a larger ecosystem of Stability AI repositories (including inference-only reference implementations like SD3.5 and web UI projects), and the README points users toward compatible components and recommended CUDA/PyTorch versions.
    Downloads: 4 This Week
  • 10
    Fara-7B

    An Efficient Agentic Model for Computer Use

    ...Rather than relying on ad-hoc or manual review processes, FARA enables organizations to profile AI behavior using standardized tests, metrics, and reporting templates, making evaluations reproducible and comparable over time. The framework supports plugin-based modules that can be tailored to industry-specific concerns or regulatory requirements, helping compliance teams, auditors, and engineers collaborate on shared assessment goals.
    Downloads: 0 This Week
  • 11
    Qwen-2.5-VL

    Qwen2.5-VL is a multimodal large language model series

    Qwen2.5 is a series of large language models developed by the Qwen team at Alibaba Cloud, designed to enhance natural language understanding and generation across multiple languages. The models are available in various sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B parameters, catering to diverse computational requirements. Trained on a comprehensive dataset of up to 18 trillion tokens, Qwen2.5 models exhibit significant improvements in instruction following, long-text generation (exceeding 8,000 tokens), and structured data comprehension, such as tables and JSON formats. They support context lengths up to 128,000 tokens and offer multilingual capabilities in over 29 languages, including Chinese, English, French, Spanish, and more. ...
    Downloads: 7 This Week
  • 12
    OpenAI Privacy Filter

    Bidirectional token-classification model for identifiable info

    ...It can run locally on standard hardware, ensuring that sensitive information never leaves the user’s environment and supporting privacy-first workflows. The system is fine-tunable, enabling adaptation to specific datasets or compliance requirements across industries. It identifies multiple categories of sensitive data such as names, emails, and credentials, replacing them with placeholders to preserve structure.
    Downloads: 1 This Week
  • 13
    HunyuanVideo-Avatar

    Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model

    HunyuanVideo-Avatar is a multimodal diffusion transformer (MM-DiT) model by Tencent Hunyuan for animating static avatar images into dynamic, emotion-controllable, and multi-character dialogue videos, conditioned on audio. It addresses challenges of motion realism, identity consistency, and emotional alignment. Innovations include a character image injection module, an Audio Emotion Module for transferring emotion cues, and a Face-Aware Audio Adapter to isolate audio effects on faces,...
    Downloads: 0 This Week
  • 14
    Qwen2.5

    Open source large language model by Alibaba

    Qwen2.5 is a series of large language models developed by the Qwen team at Alibaba Cloud, designed to enhance natural language understanding and generation across multiple languages. The models are available in various sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B parameters, catering to diverse computational requirements. Trained on a comprehensive dataset of up to 18 trillion tokens, Qwen2.5 models exhibit significant improvements in instruction following, long-text generation (exceeding 8,000 tokens), and structured data comprehension, such as tables and JSON formats. They support context lengths up to 128,000 tokens and offer multilingual capabilities in over 29 languages, including Chinese, English, French, Spanish, and more. ...
    Downloads: 22 This Week
  • 15
    MediaPipe Face Detection

    Detect faces in an image

    The MediaPipe Face Detection model is a high-performance, real-time face detection solution that uses machine learning to identify faces in images and video streams. It is optimized for mobile and embedded platforms, offering fast and accurate face detection while maintaining a small memory footprint. This model supports multiple face detections and is highly efficient, making it suitable for a variety of applications such as augmented reality, user authentication, and facial expression analysis.
    Downloads: 7 This Week
  • 16
    MoveNet

    A CNN model that predicts human joints from RGB images of a person

    The MoveNet model is an efficient, real-time human pose estimation system designed for detecting and tracking keypoints of human bodies. It utilizes deep learning to accurately locate 17 key points across the body, providing precise tracking even with fast movements. Optimized for mobile and embedded devices, MoveNet can be integrated into applications for fitness tracking, augmented reality, and interactive systems.
    Downloads: 4 This Week
  • 17
    GLM-130B

    GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)

    ...Trained on over 400 billion tokens (200B English, 200B Chinese), it achieves performance surpassing GPT-3 175B, OPT-175B, and BLOOM-176B on multiple benchmarks, while also showing significant improvements on Chinese datasets compared to other large models. The model supports efficient inference via INT8 and INT4 quantization, reducing hardware requirements from 8× A100 GPUs to as little as a single server with 4× RTX 3090s. Built on the SwissArmyTransformer (SAT) framework and compatible with DeepSpeed and FasterTransformer, it supports high-speed inference (up to 2.5× faster) and reproducible evaluation across 30+ benchmark tasks.
    Downloads: 1 This Week
  • 18
    Mistral Small 4

    Model that fuses instruct, reasoning and agentic skills

    ...These models are part of the broader Mistral Small family, which is designed to deliver strong performance across a wide range of everyday AI tasks while maintaining relatively low latency and efficient deployment requirements. The collection reflects an evolution toward hybrid mixture-of-experts architectures that dynamically activate subsets of parameters during inference, allowing large models to remain computationally efficient. Mistral Small 4 models are built to handle tasks such as conversational AI, software development assistance, and reasoning-heavy problem solving, making them versatile tools for both developers and enterprise applications.
    Downloads: 0 This Week
  • 19
    Nemotron 3 Nano

    LLM providing reasoning and conversational capabilities

    NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 is a mid-sized open large language model created by NVIDIA to provide strong reasoning and conversational capabilities while maintaining efficient deployment requirements. The model contains roughly 30 billion parameters and is designed to balance performance and computational efficiency, making it suitable for developers building AI applications that cannot run extremely large models. It is trained from scratch and built using a hybrid architecture that integrates Transformer attention layers with Mamba-style sequence modeling components inside a Mixture-of-Experts framework. ...
    Downloads: 0 This Week
  • 20
    translategemma-4b-it

    Lightweight multimodal translation model for 55 languages

    ...With a compact ~5B parameter footprint and BF16 support, the model is designed to run efficiently on laptops, desktops, and private cloud infrastructure, making advanced translation accessible without heavy hardware requirements. TranslateGemma uses a structured chat template that enforces explicit source and target language codes, ensuring consistent, deterministic behavior and reducing ambiguity in multilingual pipelines. It integrates seamlessly with Hugging Face Transformers through pipelines or direct model initialization, supporting GPU acceleration and scalable deployment.
    Downloads: 0 This Week
  • 21
    GLM-4.5-Air

    Compact hybrid reasoning language model for intelligent responses

    ...Open-sourced under the MIT license, it is commercially usable and integrates with transformers, vLLM, and SGLang inference frameworks. It includes FP8 variants for faster inference and reduced memory requirements. Despite its smaller size compared to full GLM-4.5, GLM-4.5-Air maintains high performance.
    Downloads: 0 This Week
  • 22
    Laguna XS.2

    Open agentic coding model optimized for local deployment

    ...The model contains 33B total parameters with only 3B activated per token, allowing it to deliver strong coding performance while remaining efficient enough to run locally on modern consumer hardware. It uses a hybrid attention architecture that combines Sliding Window Attention and global attention layers, reducing memory requirements and improving inference speed. Laguna XS.2 supports native reasoning with interleaved thinking between tool calls, enabling more capable autonomous coding agents and multi-step workflows. The model features a 262K-token context window, preserved reasoning across interactions, FP8 KV-cache optimization, and compatibility with local deployment ecosystems such as Ollama and vLLM.
    Downloads: 0 This Week