Showing 65 open source projects for "linux operating system"

View related business solutions
  • Add Two Lines of Code. Get Full APM. Icon
    Add Two Lines of Code. Get Full APM.

    AppSignal installs in minutes and auto-configures dashboards, alerts, and error tracking.

    Works out of the box for Rails, Django, Express, Phoenix, and more. Monitoring exceptions and performance in no time.
    Start Free
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 1
    DeepSeek R1

    DeepSeek R1

    Open-source, high-performance AI model with advanced reasoning

    DeepSeek-R1 is an open-source large language model developed by DeepSeek, designed to excel in complex reasoning tasks across domains such as mathematics, coding, and language. DeepSeek R1 offers unrestricted access for both commercial and academic use. The model employs a Mixture of Experts (MoE) architecture, comprising 671 billion total parameters with 37 billion active parameters per token, and supports a context length of up to 128,000 tokens. DeepSeek-R1's training regimen uniquely...
    Downloads: 137 This Week
    Last Update:
    See Project
  • 2
    DeepSeek-V3

    DeepSeek-V3

    Powerful AI language model (MoE) optimized for efficiency/performance

    DeepSeek-V3 is a robust Mixture-of-Experts (MoE) language model developed by DeepSeek, featuring a total of 671 billion parameters, with 37 billion activated per token. It employs Multi-head Latent Attention (MLA) and the DeepSeekMoE architecture to enhance computational efficiency. The model introduces an auxiliary-loss-free load balancing strategy and a multi-token prediction training objective to boost performance. Trained on 14.8 trillion diverse, high-quality tokens, DeepSeek-V3...
    Downloads: 87 This Week
    Last Update:
    See Project
  • 3
    Kitten TTS

    Kitten TTS

    State-of-the-art TTS model under 25MB

    KittenTTS is an open-source, ultra-lightweight, and high-quality text-to-speech model featuring just 15 million parameters and a binary size under 25 MB. It is designed for real-time CPU-based deployment across diverse platforms. Ultra-lightweight, model size less than 25MB. CPU-optimized, runs without GPU on any device. High-quality voices, several premium voice options available. Fast inference, optimized for real-time speech synthesis.
    Downloads: 15 This Week
    Last Update:
    See Project
  • 4
    MiniCPM-o

    MiniCPM-o

    A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming

    MiniCPM-o 2.6 is a cutting-edge multimodal large language model (MLLM) designed for high-performance tasks across vision, speech, and video. Capable of running on end-side devices such as smartphones and tablets, it provides powerful features like real-time speech conversation, video understanding, and multimodal live streaming. With 8 billion parameters, MiniCPM-o 2.6 surpasses its predecessors in versatility and efficiency, making it one of the most robust models available. It supports...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 5
    PaddleOCR

    PaddleOCR

    Awesome multilingual OCR toolkits based on PaddlePaddle

    ...PaddleOCR is easy to install and easy to use on Windows, Linux, MacOS and other systems.
    Downloads: 42 This Week
    Last Update:
    See Project
  • 6
    VibeVoice

    VibeVoice

    Open-source multi-speaker long-form text-to-speech model

    VibeVoice-1.5B is Microsoft’s frontier open-source text-to-speech (TTS) model designed for generating expressive, long-form, multi-speaker conversational audio such as podcasts. Unlike traditional TTS systems, it excels in scalability, speaker consistency, and natural turn-taking for up to 90 minutes of continuous speech with as many as four distinct speakers. A key innovation is its use of continuous acoustic and semantic speech tokenizers operating at an ultra-low frame rate of 7.5 Hz,...
    Downloads: 18 This Week
    Last Update:
    See Project
  • 7
    Qwen3.5

    Qwen3.5

    Qwen3.5 is the large language model series developed by Qwen team

    Qwen3.5 is part of Alibaba’s Qwen family of large language and multimodal foundation models, designed to power advanced AI applications such as chatbots, coding assistants, and autonomous agents. The project represents a significant step toward “agentic AI,” meaning models that can reason through multi-step tasks and interact with external tools or environments rather than only generating text. Qwen3.5 builds on earlier Qwen generations by improving multilingual understanding, reasoning...
    Downloads: 47 This Week
    Last Update:
    See Project
  • 8
    Qwen3-VL

    Qwen3-VL

    Qwen3-VL, the multimodal large language model series by Alibaba Cloud

    Qwen3-VL is the latest multimodal large language model series from Alibaba Cloud’s Qwen team, designed to integrate advanced vision and language understanding. It represents a major upgrade in the Qwen lineup, with stronger text generation, deeper visual reasoning, and expanded multimodal comprehension. The model supports dense and Mixture-of-Experts (MoE) architectures, making it scalable from edge devices to cloud deployments, and is available in both instruction-tuned and...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 9
    DeepSeek-OCR 2

    DeepSeek-OCR 2

    Visual Causal Flow

    DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents...
    Downloads: 8 This Week
    Last Update:
    See Project
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 10
    AlphaFold 3

    AlphaFold 3

    AlphaFold 3 inference pipeline

    AlphaFold 3, developed by Google DeepMind, is an advanced deep learning system for predicting biomolecular structures and interactions with exceptional accuracy. This repository provides the complete inference pipeline for running AlphaFold 3, though access to the model parameters is restricted and must be obtained directly from Google under specific terms of use. The system is designed for scientific research applications in structural biology, biochemistry, and bioinformatics, enabling...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 11
    Hunyuan3D-2.1

    Hunyuan3D-2.1

    From Images to High-Fidelity 3D Assets

    ...Physically Based Rendering texture synthesis to model realistic material effects, including reflections, subsurface scattering, etc. Cross-platform support (MacOS, Windows, Linux) via Python / PyTorch, including diffusers-style APIs.
    Downloads: 21 This Week
    Last Update:
    See Project
  • 12
    IndexTTS2

    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 13
    TRIBE v2

    TRIBE v2

    A multimodal model for brain response prediction

    TRIBE v2 is a multimodal foundation model developed by Meta AI for predicting human brain activity from naturalistic stimuli such as video, audio, and text. It is designed for in-silico neuroscience, enabling researchers to model how the brain responds to complex real-world inputs. The system integrates state-of-the-art encoders—including LLaMA for text, V-JEPA for video, and Wav2Vec-BERT for audio—into a unified Transformer architecture. This combined representation is mapped onto the...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 14
    HunyuanWorld-Voyager

    HunyuanWorld-Voyager

    RGBD video generation model conditioned on camera input

    HunyuanWorld-Voyager is a next-generation video diffusion framework developed by Tencent-Hunyuan for generating world-consistent 3D scene videos from a single input image. By leveraging user-defined camera paths, it enables immersive scene exploration and supports controllable video synthesis with high realism. The system jointly produces aligned RGB and depth video sequences, making it directly applicable to 3D reconstruction tasks. At its core, Voyager integrates a world-consistent video...
    Downloads: 19 This Week
    Last Update:
    See Project
  • 15
    CodexBar

    CodexBar

    Show usage stats for OpenAI Codex and Claude Code

    CodexBar is a lightweight macOS utility that displays real-time usage statistics for AI coding tools such as OpenAI Codex and Claude Code directly from the system menu bar. The application is designed to give developers quick visibility into token consumption and activity without requiring them to open web dashboards or log into provider portals. Built in Swift with a native macOS interface, it integrates seamlessly into the desktop environment and emphasizes minimal overhead. The tool is...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 16
    TADA

    TADA

    Open Source Speech Language Model

    TADA is an open-source speech-language modeling framework designed to unify spoken audio and text representations within a single generative architecture. The system focuses on aligning speech and text streams using a dual-alignment mechanism that synchronizes the acoustic signal with its textual representation. By modeling both modalities together, the framework allows developers to build systems capable of generating, understanding, and transforming speech and language simultaneously. This...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 17
    DeepSeek-OCR

    DeepSeek-OCR

    Contexts Optical Compression

    DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 18
    Claude Code SDK Python

    Claude Code SDK Python

    Python SDK for Claude Agent

    claude-code-sdk-python is the Python SDK for Claude Code, Anthropic’s agentic coding system. It provides abstractions to easily query Claude Code (with streaming support) and conduct interactive sessions. The SDK includes core client classes, asynchronous query functions, and support for custom tools and hooks within Claude sessions. It is designed to integrate with local Python workflows and allow developers to embed Claude Code capabilities directly in their applications or scripts. The...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 19
    LeWorldModel

    LeWorldModel

    Official code base for LeWorldModel: Stable End-to-End Joint-Embedding

    LeWorldModel is a minimalist tiling window manager designed for the X11 windowing system, focusing on simplicity, performance, and efficient use of screen space. It provides automatic window tiling behavior, organizing application windows into structured layouts without requiring manual resizing or positioning. The project emphasizes a lightweight design, minimizing resource usage while maintaining responsiveness and stability. It is highly configurable through source code or configuration...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 20
    SlowFast

    SlowFast

    Video understanding codebase from FAIR for reproducing video models

    SlowFast is a video understanding framework that captures both spatial semantics and temporal dynamics efficiently by processing video frames at two different temporal resolutions. The slow pathway encodes semantic context by sampling frames sparsely, while the fast pathway captures motion and fine temporal cues by operating on densely sampled frames with fewer channels. Together, these two pathways complement each other, allowing the network to model both appearance and motion without...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    MiniMax-M2.5

    MiniMax-M2.5

    State of the art LLM and coding model

    MiniMax-M2.5 is a state-of-the-art foundation model extensively trained with reinforcement learning across hundreds of thousands of real-world environments. It delivers leading performance in coding, agentic tool use, search, and complex office workflows, achieving top benchmark scores such as 80.2% on SWE-Bench Verified and 76.3% on BrowseComp. Designed to reason efficiently and decompose tasks like an experienced architect, M2.5 plans features, structure, and system design before...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    fairseq2

    fairseq2

    FAIR Sequence Modeling Toolkit 2

    fairseq2 is a modern, modular sequence modeling framework developed by Meta AI Research as a complete redesign of the original fairseq library. Built from the ground up for scalability, composability, and research flexibility, fairseq2 supports a broad range of language, speech, and multimodal content generation tasks, including instruction fine-tuning, reinforcement learning from human feedback (RLHF), and large-scale multilingual modeling. Unlike the original fairseq—which evolved into a...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 23
    DeepSeek Math

    DeepSeek Math

    Pushing the Limits of Mathematical Reasoning in Open Language Models

    DeepSeek-Math is DeepSeek’s specialized model (or dataset + evaluation) focusing on mathematical reasoning, symbolic manipulation, proof steps, and advanced quantitative problem solving. The repository is likely to include fine-tuning routines or task datasets (e.g. MATH, GSM8K, ARB), demonstration notebooks, prompt templates, and evaluation results on math benchmarks. The goal is to push DeepSeek’s performance in domains that require rigorous symbolic steps, calculus, linear algebra, number...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    FireRedTTS-2

    FireRedTTS-2

    Long-form streaming TTS system for multi-speaker dialogue generation

    FireRedTTS2 is a next-generation open-source text-to-speech (TTS) system focused on long-form, streaming speech synthesis for multi-speaker dialogue, delivering stable natural speech with context-aware prosody and reliable speaker transitions that support real-time and conversational applications. It features a specialized streaming speech tokenizer and a dual-transformer architecture that enables low latency and high-quality synthesis, making it suitable for interactive systems like...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Kimi-Audio

    Kimi-Audio

    Audio foundation model excelling in audio understanding

    Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
MongoDB Logo MongoDB