Showing 247 open source projects for "linux-tools"

View related business solutions
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 1
    HunyuanWorld-Voyager

    HunyuanWorld-Voyager

    RGBD video generation model conditioned on camera input

    HunyuanWorld-Voyager is a next-generation video diffusion framework developed by Tencent-Hunyuan for generating world-consistent 3D scene videos from a single input image. By leveraging user-defined camera paths, it enables immersive scene exploration and supports controllable video synthesis with high realism. The system jointly produces aligned RGB and depth video sequences, making it directly applicable to 3D reconstruction tasks. At its core, Voyager integrates a world-consistent video...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 2
    SAM 3D Body

    SAM 3D Body

    Code for running inference with the SAM 3D Body Model 3DB

    SAM 3D Body is a promptable model for single-image full-body 3D human mesh recovery, designed to estimate detailed human pose and shape from just one RGB image. It reconstructs the full body, including feet and hands, using the Momentum Human Rig (MHR), a parametric mesh representation that decouples skeletal structure from surface shape for more accurate and interpretable results. The model is trained to be robust in diverse, in-the-wild conditions, so it handles varied clothing,...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 3
    GLM-4

    GLM-4

    GLM-4 series: Open Multilingual Multimodal Chat LMs

    GLM-4 is a family of open models from ZhipuAI that spans base, chat, and reasoning variants at both 32B and 9B scales, with long-context support and practical local-deployment options. The GLM-4-32B-0414 models are trained on ~15T high-quality data (including substantial synthetic reasoning data), then post-trained with preference alignment, rejection sampling, and reinforcement learning to improve instruction following, coding, function calling, and agent-style behaviors. The...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 4
    DFlash

    DFlash

    Block Diffusion for Ultra-Fast Speculative Decoding

    DFlash is an open-source framework for ultra-fast speculative decoding using a lightweight block diffusion model to draft text in parallel with a target large language model, dramatically improving inference speed without sacrificing generation quality. It acts as a “drafter” that proposes likely continuations which the main model then verifies, enabling significant throughput gains compared to traditional autoregressive decoding methods that generate token by token. This approach has been...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 8 Monitoring Tools in One APM. Install in 5 Minutes. Icon
    8 Monitoring Tools in One APM. Install in 5 Minutes.

    Errors, performance, logs, uptime, hosts, anomalies, dashboards, and check-ins. One interface.

    AppSignal works out of the box for Ruby, Elixir, Node.js, Python, and more. 30-day free trial, no credit card required.
    Start Free
  • 5
    ComfyUI-LTXVideo

    ComfyUI-LTXVideo

    LTX-Video Support for ComfyUI

    ComfyUI-LTXVideo is a bridge between ComfyUI’s node-based generative workflow environment and the LTX-Video multimedia processing framework, enabling creators to orchestrate complex video tasks within a visual graph paradigm. Instead of writing code to apply effects, transitions, edits, and data flows, users can assemble nodes that represent video inputs, transformations, and outputs, letting them prototype and automate video production pipelines visually. This integration empowers...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 6
    LTX-Video

    LTX-Video

    Official repository for LTX-Video

    LTX-Video is a sophisticated multimedia processing framework from Lightricks designed to handle high-quality video editing, compositing, and transformation tasks with performance and scalability. It provides runtime components that efficiently decode, encode, and manipulate video streams, frame buffers, and audio tracks while exposing a rich API for building customized editing features like transitions, effects, color grading, and keyframe automation. The toolkit is built with both real-time...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 7
    SAM 3D Objects

    SAM 3D Objects

    Models for object and human mesh reconstruction

    SAM 3D Objects is a foundation model that reconstructs full 3D geometry, texture, and spatial layout of objects and scenes from a single image. Given one RGB image and object masks (for example, from the Segment Anything family), it can generate a textured 3D mesh for each object, including pose and approximate scene layout. The model is specifically designed to be robust in real-world images with clutter, occlusions, small objects, and unusual viewpoints, where many earlier 3D-from-image...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 8
    Qwen

    Qwen

    The official repo of Qwen chat & pretrained large language model

    Qwen is a series of large language models developed by Alibaba Cloud, consisting of various pretrained versions like Qwen-1.8B, Qwen-7B, Qwen-14B, and Qwen-72B. These models, which range from smaller to larger configurations, are designed for a wide range of natural language processing tasks. They are openly available for research and commercial use, with Qwen's code and model weights shared on GitHub. Qwen's capabilities include text generation, comprehension, and conversation, making it a...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 9
    OpenAI Privacy Filter

    OpenAI Privacy Filter

    Bidirectional token-classification model for identifiable info

    OpenAI Privacy Filter is an open-weight machine learning model designed to detect and mask personally identifiable information in text with high efficiency and contextual awareness. It operates as a bidirectional token classification system that labels sensitive data in a single forward pass rather than generating text sequentially, enabling fast processing for large datasets. The model supports long-context inputs, allowing it to analyze extensive documents without chunking, which improves...
    Downloads: 7 This Week
    Last Update:
    See Project
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 10
    SeedVR

    SeedVR

    Repo for SeedVR2 & SeedVR

    SeedVR (from the ByteDance-Seed organization) is an open-source research and implementation repository focused on cutting-edge video restoration using diffusion transformer architectures. The project includes both the original SeedVR and its successor SeedVR2 models, which are designed to restore degraded or low-quality video content by learning to reconstruct high-fidelity frames with temporal coherence. These models leverage advanced techniques such as adaptive attention mechanisms and...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 11
    VOID

    VOID

    Video Object and Interaction Deletion

    VOID is an advanced AI video processing system developed by Netflix that focuses on removing objects from videos while preserving the physical and visual realism of the surrounding environment. Unlike traditional inpainting methods that only erase pixels or simple artifacts, VOID models the full interaction dynamics between objects and their environment, including shadows, reflections, and even physical consequences such as movement or balance changes. Built on top of transformer-based...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 12
    GLM-OCR

    GLM-OCR

    Accurate × Fast × Comprehensive

    GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B),...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 13
    ChatGPT Retrieval Plugin

    ChatGPT Retrieval Plugin

    The ChatGPT Retrieval Plugin lets you easily find personal documents

    The chatgpt-retrieval-plugin repository implements a semantic retrieval backend that lets ChatGPT (or GPT-powered tools) access private or organizational documents in natural language by combining vector search, embedding models, and plugin infrastructure. It can serve as a custom GPT plugin or function-calling backend so that a chat session can “look up” relevant documents based on user queries, inject those results into context, and respond more knowledgeably about a private knowledge base. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Qwen-Image-Layered

    Qwen-Image-Layered

    Qwen-Image-Layered: Layered Decomposition for Inherent Editablity

    Qwen-Image-Layered is an extension of the Qwen series of multimodal models that introduces layered image understanding, enabling the model to reason about hierarchical visual structures — such as separating foreground, background, objects, and contextual layers within an image. This architecture allows richer semantic interpretation, enabling use cases such as scene decomposition, object-level editing, layered captioning, and more fine-grained multimodal reasoning than with flat image...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 15
    DeepSeek-OCR

    DeepSeek-OCR

    Contexts Optical Compression

    DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 16
    Depth Pro

    Depth Pro

    Sharp Monocular Metric Depth in Less Than a Second

    Depth Pro is a foundation model for zero-shot metric monocular depth estimation, producing sharp, high-frequency depth maps with absolute scale from a single image. Unlike many prior approaches, it does not require camera intrinsics or extra metadata, yet still outputs metric depth suitable for downstream 3D tasks. Apple highlights both accuracy and speed: the model can synthesize a ~2.25-megapixel depth map in around 0.3 seconds on a standard GPU, enabling near real-time applications. The...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 17
    Lyra 2

    Lyra 2

    Project Lyra: Open Generative 3D World Models

    The Lyra 2 project is a research-driven framework developed by NVIDIA that focuses on building open generative 3D world models using advanced diffusion-based techniques. It enables the creation of fully explorable 3D environments from minimal inputs such as a single image or video, leveraging self-distillation methods to generate consistent spatial representations. The system evolves across versions, with newer iterations introducing long-horizon generation and improved 3D consistency across...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 18
    NVIDIA Earth2Studio

    NVIDIA Earth2Studio

    Open-source deep-learning framework

    NVIDIA Earth2Studio is an open-source Python package and framework designed to accelerate the development and deployment of AI-driven weather and climate science workflows. It provides a unified API that lets researchers, data scientists, and engineers build complex forecasting and analysis pipelines by combining modular prognostic and diagnostic AI models with a diverse range of real-world data sources such as global forecast systems, reanalysis datasets, and satellite feeds. The toolkit...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 19
    Kitten TTS

    Kitten TTS

    State-of-the-art TTS model under 25MB

    KittenTTS is an open-source, ultra-lightweight, and high-quality text-to-speech model featuring just 15 million parameters and a binary size under 25 MB. It is designed for real-time CPU-based deployment across diverse platforms. Ultra-lightweight, model size less than 25MB. CPU-optimized, runs without GPU on any device. High-quality voices, several premium voice options available. Fast inference, optimized for real-time speech synthesis.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 20
    DINOv2

    DINOv2

    PyTorch code and models for the DINOv2 self-supervised learning

    DINOv2 is a self-supervised vision learning framework that produces strong, general-purpose image representations without using human labels. It builds on the DINO idea of student–teacher distillation and adapts it to modern Vision Transformer backbones with a carefully tuned recipe for data augmentation, optimization, and multi-crop training. The core promise is that a single pretrained backbone can transfer well to many downstream tasks—from linear probing on classification to retrieval,...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 21
    GLM-4-Voice

    GLM-4-Voice

    GLM-4-Voice | End-to-End Chinese-English Conversational Model

    GLM-4-Voice is an open-source speech-enabled model from ZhipuAI, extending the GLM-4 family into the audio domain. It integrates advanced voice recognition and generation with the multimodal reasoning capabilities of GLM-4, enabling smooth natural interaction via spoken input and output. The model supports real-time speech-to-text transcription, spoken dialogue understanding, and text-to-speech synthesis, making it suitable for conversational AI, virtual assistants, and accessibility...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 22
    HunyuanVideo-I2V

    HunyuanVideo-I2V

    A Customizable Image-to-Video Model based on HunyuanVideo

    HunyuanVideo-I2V is a customizable image-to-video generation framework from Tencent Hunyuan, built on their HunyuanVideo foundation. It extends video generation so that given a static reference image plus an optional prompt, it generates a video sequence that preserves the reference image’s identity (especially in the first frame) and allows stylized effects via LoRA adapters. The repository includes pretrained weights, inference and sampling scripts, training code for LoRA effects, and...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 23
    Claude Code Security Reviewer

    Claude Code Security Reviewer

    An AI-powered security review GitHub Action using Claude

    The claude-code-security-review repository implements a GitHub Action that uses Claude (via the Anthropic API) to perform semantic security audits of code changes in pull requests. Rather than relying purely on pattern matching or static analysis, this action feeds diffs and surrounding context to Claude to reason about potential vulnerabilities (e.g. injection, misconfigurations, secrets exposure, etc). When a PR is opened, the action analyzes only the changed files (diff-aware scanning),...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 24
    GLM-V

    GLM-V

    GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning

    GLM-V is an open-source vision-language model (VLM) series from ZhipuAI that extends the GLM foundation models into multimodal reasoning and perception. The repository provides both GLM-4.5V and GLM-4.1V models, designed to advance beyond basic perception toward higher-level reasoning, long-context understanding, and agent-based applications. GLM-4.5V builds on the flagship GLM-4.5-Air foundation (106B parameters, 12B active), achieving state-of-the-art results on 42 benchmarks across image,...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 25
    llama.cpp Python Bindings

    llama.cpp Python Bindings

    Python bindings for llama.cpp

    llama-cpp-python provides Python bindings for llama.cpp, enabling the integration of LLaMA (Large Language Model Meta AI) language models into Python applications. This facilitates the use of LLaMA's capabilities in natural language processing tasks within Python environments.
    Downloads: 2 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB