Showing 2659 open source projects for "gnu/linux"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Streamline Azure Security with Palo Alto Networks VM-Series Icon
    Streamline Azure Security with Palo Alto Networks VM-Series

    Centrally manage physical and virtualized firewalls with Panorama

    Improve your security posture and reduce incident response time. Use the VM-Series to natively analyze Azure traffic and dynamically drive policy updates based on workload changes.
    Learn more
  • 1
    ESPnet

    ESPnet

    End-to-end speech processing toolkit

    ESPnet is a comprehensive end-to-end speech processing toolkit covering a wide spectrum of tasks, including automatic speech recognition (ASR), text-to-speech (TTS), speech translation (ST), speech enhancement, speaker diarization, and spoken language understanding. It uses PyTorch as its deep learning engine and adopts a Kaldi-style data processing pipeline for features, data formats, and experimental recipes. This combination allows researchers to leverage modern neural architectures while...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 2
    Materials Discovery: GNoME

    Materials Discovery: GNoME

    AI discovers 520000 stable inorganic crystal structures for research

    Materials Discovery (GNoME) is a large-scale research initiative by Google DeepMind focused on applying graph neural networks to accelerate the discovery of stable inorganic crystal materials. The project centers on Graph Networks for Materials Exploration (GNoME), a message-passing neural network architecture trained on density functional theory (DFT) data to predict material stability and energy formation. Using GNoME, DeepMind identified 381,000 new stable materials, later expanding the...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    Mesh R-CNN

    Mesh R-CNN

    code for Mesh R-CNN, ICCV 2019

    Mesh R-CNN is a 3D reconstruction and object understanding framework developed by Facebook Research that extends Mask R-CNN into the 3D domain. Built on top of Detectron2 and PyTorch3D, Mesh R-CNN enables end-to-end 3D mesh prediction directly from single RGB images. The model learns to detect, segment, and reconstruct detailed 3D mesh representations of objects in natural images, bridging the gap between 2D perception and 3D understanding. Unlike voxel-based or point-based approaches, Mesh...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    vJEPA-2

    vJEPA-2

    PyTorch code and models for VJEPA2 self-supervised learning from video

    VJEPA2 is a next-generation self-supervised learning framework for video that extends the “predict in representation space” idea from i-JEPA to the temporal domain. Instead of reconstructing pixels, it predicts the missing high-level embeddings of masked space-time regions using a context encoder and a slowly updated target encoder. This objective encourages the model to learn semantics, motion, and long-range structure without the shortcuts that pixel-level losses can invite. The...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 5
    Claude Code Security Reviewer

    Claude Code Security Reviewer

    An AI-powered security review GitHub Action using Claude

    The claude-code-security-review repository implements a GitHub Action that uses Claude (via the Anthropic API) to perform semantic security audits of code changes in pull requests. Rather than relying purely on pattern matching or static analysis, this action feeds diffs and surrounding context to Claude to reason about potential vulnerabilities (e.g. injection, misconfigurations, secrets exposure, etc). When a PR is opened, the action analyzes only the changed files (diff-aware scanning),...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    CogVLM2

    CogVLM2

    GPT4V-level open-source multi-modal model based on Llama3-8B

    CogVLM2 is the second generation of the CogVLM vision-language model series, developed by ZhipuAI and released in 2024. Built on Meta-Llama-3-8B-Instruct, CogVLM2 significantly improves over its predecessor by providing stronger performance across multimodal benchmarks such as TextVQA, DocVQA, and ChartQA, while introducing extended context length support of up to 8K tokens and high-resolution image input up to 1344×1344. The series includes models for both image understanding and video...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    GLM-V

    GLM-V

    GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning

    GLM-V is an open-source vision-language model (VLM) series from ZhipuAI that extends the GLM foundation models into multimodal reasoning and perception. The repository provides both GLM-4.5V and GLM-4.1V models, designed to advance beyond basic perception toward higher-level reasoning, long-context understanding, and agent-based applications. GLM-4.5V builds on the flagship GLM-4.5-Air foundation (106B parameters, 12B active), achieving state-of-the-art results on 42 benchmarks across image,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    deep-q-learning

    deep-q-learning

    Minimal Deep Q Learning (DQN & DDQN) implementations in Keras

    The deep-q-learning repository authored by keon provides a Python-based implementation of the Deep Q-Learning algorithm — a cornerstone method in reinforcement learning. It implements the core logic needed to train an agent using Q-learning with neural networks (i.e. approximating Q-values via deep nets), setting up environment interaction loops, experience replay, network updates, and policy behavior. For learners and researchers interested in reinforcement learning, this repo offers a...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    LatentSync

    LatentSync

    Taming Stable Diffusion for Lip Sync

    LatentSync is an open-source framework from ByteDance that produces high-quality lip-synchronization for video by using an audio-conditioned latent diffusion model, bypassing traditional intermediate motion representations. In effect, given a source video (with masked or reference frames) and an audio track, LatentSync directly generates frames whose lip motions and expressions align with the audio, producing convincing talking-head or animated lip-sync output. The system leverages a U-Net...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Compliant and Reliable File Transfers Backed by Top Security Certifications Icon
    Compliant and Reliable File Transfers Backed by Top Security Certifications

    Cerberus FTP Server delivers SOC 2 Type II certified security and FIPS 140-2 validated encryption.

    Stop relying on non-certified, legacy file transfer tools that creak under the weight of modern security demands. Get full audit trails, advanced access controls and more supported by an award-winning team of experts. Start your free 25-day trial today.
    Start Free Trial
  • 10
    Open Vision Agents by Stream

    Open Vision Agents by Stream

    Build Vision Agents quickly with any model or video provider

    Open Vision Agents by Stream is an open source framework from Stream for building real time, multimodal AI agents that watch, listen, and respond to live video streams. It focuses on combining video understanding models, such as YOLO and Roboflow based detectors, with real time large language models like OpenAI Realtime and Gemini Live to create interactive experiences. The framework uses Stream’s ultra low latency edge network so agents can join sessions quickly and maintain very low audio...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    VibeVoice ComfyUI

    VibeVoice ComfyUI

    ComfyUI integration for Microsoft's VibeVoice text-to-speech model

    VibeVoice ComfyUI is a comprehensive wrapper that integrates Microsoft’s VibeVoice text-to-speech models directly into ComfyUI workflows. It exposes VibeVoice as a set of custom nodes so you can build single-speaker and multi-speaker voice generation pipelines visually, combining TTS with other audio or generative components. The integration supports high-quality single-speaker synthesis as well as scripted multi-speaker conversations, with optional voice cloning from audio samples for each...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    CogVLM

    CogVLM

    A state-of-the-art open visual language model

    CogVLM is an open-source visual–language model suite—and its GUI-oriented sibling CogAgent—aimed at image understanding, grounding, and multi-turn dialogue, with optional agent actions on real UI screenshots. The flagship CogVLM-17B combines ~10B visual parameters with ~7B language parameters and supports 490×490 inputs; CogAgent-18B extends this to 1120×1120 and adds plan/next-action outputs plus grounded operation coordinates for GUI tasks. The repo provides multiple ways to run models...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    VisualGLM-6B

    VisualGLM-6B

    Chinese and English multimodal conversational language model

    VisualGLM-6B is an open-source multimodal conversational language model developed by ZhipuAI that supports both images and text in Chinese and English. It builds on the ChatGLM-6B backbone, with 6.2 billion language parameters, and incorporates a BLIP2-Qformer visual module to connect vision and language. In total, the model has 7.8 billion parameters. Trained on a large bilingual dataset — including 30 million high-quality Chinese image-text pairs from CogView and 300 million English pairs...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    Tongyi DeepResearch

    Tongyi DeepResearch

    Tongyi Deep Research, the Leading Open-source Deep Research Agent

    DeepResearch (Tongyi DeepResearch) is an open-source “deep research agent” developed by Alibaba’s Tongyi Lab designed for long-horizon, information-seeking tasks. It’s built to act like a research agent: synthesizing, reasoning, retrieving information via the web and documents, and backing its outputs with evidence. The model is about 30.5 billion parameters in size, though at any given token only ~3.3B parameters are active. It uses a mix of synthetic data generation, fine-tuning and...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    AWS MCP Servers

    AWS MCP Servers

    Helping you get the most out of AWS, wherever you use MCP

    AWS MCP Servers are a collection of remotely hosted, fully-managed Model Context Protocol (MCP) servers by AWS, providing AI applications with real-time access to AWS documentation, API references, best practices, and infrastructure-management capabilities via natural-language workflows. An MCP Server is a lightweight program that exposes specific capabilities through the standardized Model Context Protocol. Host applications (such as chatbots, IDEs, and other AI tools) have MCP clients that...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    DeepSpeed MII

    DeepSpeed MII

    MII makes low-latency and high-throughput inference possible

    MII makes low-latency and high-throughput inference possible, powered by DeepSpeed. The Deep Learning (DL) open-source community has seen tremendous growth in the last few months. Incredibly powerful text generation models such as the Bloom 176B, or image generation model such as Stable Diffusion are now available to anyone with access to a handful or even a single GPU through platforms such as Hugging Face. While open-sourcing has democratized access to AI capabilities, their application is...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    UForm

    UForm

    Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion

    UForm is a Multi-Modal Modal Inference package, designed to encode Multi-Lingual Texts, Images, and, soon, Audio, Video, and Documents, into a shared vector space! It comes with a set of homonymous pre-trained networks available on HuggingFace portal and extends the transfromers package to support Mid-fusion Models. Late-fusion models encode each modality independently, but into one shared vector space. Due to independent encoding late-fusion models are good at capturing coarse-grained...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    NVIDIA NeMo

    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    DeepPavlov

    DeepPavlov

    A library for deep learning end-to-end dialog systems and chatbots

    DeepPavlov makes it easy for beginners and experts to create dialogue systems. The best place to start is with user-friendly tutorials. They provide quick and convenient introduction on how to use DeepPavlov with complete, end-to-end examples. No installation needed. Guides explain the concepts and components of DeepPavlov. Follow step-by-step instructions to install, configure and extend DeepPavlov framework for your use case. DeepPavlov is an open-source framework for chatbots and virtual...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    Python Outlier Detection

    Python Outlier Detection

    A Python toolbox for scalable outlier detection

    PyOD is a comprehensive and scalable Python toolkit for detecting outlying objects in multivariate data. This exciting yet challenging field is commonly referred as outlier detection or anomaly detection. PyOD includes more than 30 detection algorithms, from classical LOF (SIGMOD 2000) to the latest COPOD (ICDM 2020) and SUOD (MLSys 2021). Since 2017, PyOD [AZNL19] has been successfully used in numerous academic researches and commercial products [AZHC+21, AZNHL19]. PyOD has multiple neural...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    Stanza

    Stanza

    Stanford NLP Python library for many human languages

    Stanza is a collection of accurate and efficient tools for the linguistic analysis of many human languages. Starting from raw text to syntactic analysis and entity recognition, Stanza brings state-of-the-art NLP models to languages of your choosing. Stanza is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    Step1X-3D

    Step1X-3D

    High-Fidelity and Controllable Generation of Textured 3D Assets

    Step1X-3D is an open-source framework for generating high-fidelity textured 3D assets from scratch — both their geometry and surface textures — using modern generative AI techniques. It combines a hybrid architecture: a geometry generation stage using a VAE-DiT model to output a watertight 3D representation (e.g. TSDF surface), and a texture synthesis stage that conditions on geometry and optionally reference input (or prompts) to produce view-consistent textures using a diffusion-based...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    Step-Video-T2V

    Step-Video-T2V

    State-of-the-art (SoTA) text-to-video pre-trained model

    Step-Video-T2V is a state-of-the-art text-to-video foundation model developed to generate videos from natural-language prompts; its 30B-parameter architecture is designed to produce coherent, temporally extended video sequences — up to around 204 frames — based on input text. Under the hood it uses a compressed latent representation (a Video-VAE) to reduce spatial and temporal redundancy, and a denoising diffusion (or similar) process over that latent space to generate smooth, plausible...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    HunyuanOCR

    HunyuanOCR

    OCR expert VLM powered by Hunyuan's native multimodal architecture

    HunyuanOCR is an open-source, end-to-end OCR (optical character recognition) Vision-Language Model (VLM) developed by Tencent‑Hunyuan. It’s designed to unify the entire OCR pipeline, detection, recognition, layout parsing, information extraction, translation, and even subtitle or structured output generation, into a single model inference instead of a cascade of separate tools. Despite being fairly lightweight (about 1 billion parameters), it delivers state-of-the-art performance across a...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 25
    SDGym

    SDGym

    Benchmarking synthetic data generation methods

    The Synthetic Data Gym (SDGym) is a benchmarking framework for modeling and generating synthetic data. Measure performance and memory usage across different synthetic data modeling techniques – classical statistics, deep learning and more! The SDGym library integrates with the Synthetic Data Vault ecosystem. You can use any of its synthesizers, datasets or metrics for benchmarking. You also customize the process to include your own work. Select any of the publicly available datasets from the...
    Downloads: 1 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB