AI Models for BSD

Browse free, open source AI models and projects for BSD below.

  • 1
    llama.cpp

    Port of Facebook's LLaMA model in C/C++

    The llama.cpp project enables the inference of Meta's LLaMA model (and other models) in pure C/C++ without requiring a Python runtime. It is designed for efficient and fast model execution, offering easy integration for applications needing LLM-based capabilities. The repository focuses on providing a highly optimized and portable implementation for running large language models directly within C/C++ environments.
    Downloads: 134 This Week
  • 2
    GLM-4.6

    Agentic, Reasoning, and Coding (ARC) foundation models

    GLM-4.6 is an iteration of Zhipu AI’s foundation model that delivers significant advancements over GLM-4.5. It introduces an extended 200K token context window, enabling more sophisticated long-context reasoning and agentic workflows. The model achieves stronger coding performance, both in benchmarks and in practical coding assistants such as Claude Code, Cline, Roo Code, and Kilo Code. Its reasoning capabilities have been strengthened, including improved tool usage during inference and more effective integration within agent frameworks. GLM-4.6 also improves writing quality, producing outputs that better align with human preferences and role-playing scenarios. Benchmark evaluations indicate that it outperforms GLM-4.5 and rivals leading global models such as DeepSeek-V3.1-Terminus and Claude Sonnet 4.
    Downloads: 132 This Week
  • 3
    DeepSeek-V3

    Powerful AI language model (MoE) optimized for efficiency/performance

    DeepSeek-V3 is a robust Mixture-of-Experts (MoE) language model developed by DeepSeek, featuring a total of 671 billion parameters, with 37 billion activated per token. It employs Multi-head Latent Attention (MLA) and the DeepSeekMoE architecture to enhance computational efficiency. The model introduces an auxiliary-loss-free load balancing strategy and a multi-token prediction training objective to boost performance. Trained on 14.8 trillion diverse, high-quality tokens, DeepSeek-V3 underwent supervised fine-tuning and reinforcement learning to fully realize its capabilities. Evaluations indicate that it outperforms other open-source models and rivals leading closed-source models, achieving this with a training duration of 55 days on 2,048 Nvidia H800 GPUs, costing approximately $5.58 million.
    Downloads: 87 This Week
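
The figures quoted in the DeepSeek-V3 description can be checked with a little arithmetic. The sketch below uses only the numbers stated above and assumes, for simplicity, that all 2,048 GPUs ran continuously for the full 55 days; the function and variable names are illustrative and do not come from the DeepSeek repository.

```python
# Figures quoted in the DeepSeek-V3 description.
TOTAL_PARAMS = 671e9       # total parameters
ACTIVE_PARAMS = 37e9       # parameters activated per token
TRAINING_DAYS = 55
NUM_GPUS = 2048
TRAINING_COST_USD = 5.58e6


def active_fraction(active, total):
    """Share of the parameters used for any single token."""
    return active / total


def gpu_hours(days, gpus):
    """GPU-hours, assuming every GPU runs 24 hours a day (an assumption)."""
    return days * 24 * gpus


def cost_per_gpu_hour(cost_usd, hours):
    """Implied price of one GPU-hour."""
    return cost_usd / hours


share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
hours = gpu_hours(TRAINING_DAYS, NUM_GPUS)
rate = cost_per_gpu_hour(TRAINING_COST_USD, hours)

print(f"active share per token: {share:.1%}")
print(f"approximate GPU-hours:  {hours:,}")
print(f"implied USD per GPU-hour: {rate:.2f}")
```

Under these assumptions, roughly 5.5% of the parameters are active per token, and the quoted cost works out to about two US dollars per GPU-hour.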
  • 4
    LTX-2.3

    Official Python inference and LoRA trainer package

    LTX-2.3 is an open-source multimodal artificial intelligence foundation model developed by Lightricks for generating synchronized video and audio from prompts or other inputs. Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes. The model uses a diffusion-transformer-based architecture designed to generate high-fidelity visual frames while simultaneously producing corresponding audio elements such as speech, music, ambient sound, or effects. This unified approach allows creators to generate complete multimedia sequences where motion, timing, and sound are aligned automatically. LTX-2 is designed for both research and production workflows and can generate high-resolution video clips with precise control over structure, motion, and camera behavior.
    Downloads: 87 This Week
  • 5
    DeepSeek R1

    Open-source, high-performance AI model with advanced reasoning

    DeepSeek-R1 is an open-source large language model developed by DeepSeek, designed to excel in complex reasoning tasks across domains such as mathematics, coding, and language. It is available for both commercial and academic use. The model employs a Mixture of Experts (MoE) architecture with 671 billion total parameters, of which 37 billion are active per token, and supports a context length of up to 128,000 tokens. Its training centers on large-scale reinforcement learning (RL): the precursor DeepSeek-R1-Zero was trained with RL alone, without supervised fine-tuning, and DeepSeek-R1 adds a small amount of cold-start data before RL. This approach yields performance comparable to leading models such as OpenAI's o1 while remaining cost-efficient. To further support the research community, DeepSeek has released distilled versions of the model based on architectures such as LLaMA and Qwen.
    Downloads: 86 This Week
  • 6
    GLM-5

    From Vibe Coding to Agentic Engineering

    GLM-5 is a next-generation open-source large language model (LLM) developed by the Z.ai team under the zai-org organization, aimed at reasoning, coding, and long-horizon agentic tasks. Building on earlier GLM models, GLM-5 scales the parameter count to roughly 744 billion and expands the pre-training data to improve performance over predecessors such as GLM-4.5 on complex tasks including multi-step reasoning, software engineering workflows, and agent orchestration. It incorporates DeepSeek Sparse Attention (DSA) to reduce deployment costs while preserving long-context capacity, which matters for detailed plans and agent tasks.
    Downloads: 50 This Week
  • 7
    DeepSeek Coder V2

    DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models

    DeepSeek-Coder-V2 is the version-2 iteration of DeepSeek’s code generation models, refining the original DeepSeek-Coder line with improved architecture, training strategies, and benchmark performance. While the V1 models already targeted strong code understanding and generation, V2 appears to push further in both multilingual support and reasoning in code, likely via architectural enhancements or additional training objectives. The repository provides updated model weights, evaluation results on benchmarks (e.g. HumanEval, MultiPL-E, APPS), and new inference/serving scripts. Compared to the original, DeepSeek-Coder-V2 likely incorporates improved context management, caching strategies, or enhanced infilling capabilities. The project aims to provide a more performant and reliable open-source alternative to closed-source code models, optimized for practical usage in code completion, infilling, and code understanding across English and Chinese codebases.
    Downloads: 44 This Week
  • 8
    FLUX.1

    Official inference repo for FLUX.1 models

    FLUX.1 repository contains inference code and tooling for the FLUX.1 text-to-image diffusion models, enabling developers and researchers to generate and edit images from natural-language prompts using open-weight versions of the model on their own hardware or within custom applications. The project is part of a larger family of FLUX models developed by Black Forest Labs, designed to produce high-quality, detailed visuals from text descriptions with competitive prompt adherence and artistic fidelity. This repo focuses on running the open-source model variants efficiently, providing scripts, model loading logic, and examples for local installations, and supports integration with Python toolchains like PyTorch and popular generative pipelines. Users can launch CLI tools to generate images, experiment with different FLUX variants, and extend the base code for research-oriented applications.
    Downloads: 41 This Week
  • 9
    SAM 3

    Code for running inference and finetuning with SAM 3 model

    SAM 3 (Segment Anything Model 3) is a unified foundation model for promptable segmentation in both images and videos, capable of detecting, segmenting, and tracking objects. It accepts both text prompts (open-vocabulary concepts like “red car” or “goalkeeper in white”) and visual prompts (points, boxes, masks) and returns high-quality masks, boxes, and scores for the requested concepts. Compared with SAM 2, SAM 3 introduces the ability to exhaustively segment all instances of an open-vocabulary concept specified by a short phrase or exemplars, scaling to a vastly larger set of categories than traditional closed-set models. This capability is grounded in a new data engine that automatically annotated over four million unique concepts, producing a massive open-vocabulary segmentation dataset and enabling the model to achieve 75–80% of human performance on the SA-CO benchmark, which itself spans 270K unique concepts.
    Downloads: 37 This Week
  • 10
    GLM-4.5

    GLM-4.5: Open-source LLM for intelligent agents by Z.ai

    GLM-4.5 is a cutting-edge open-source large language model designed by Z.ai for intelligent agent applications. The flagship GLM-4.5 model has 355 billion total parameters with 32 billion active parameters, while the compact GLM-4.5-Air version offers 106 billion total parameters and 12 billion active parameters. Both models unify reasoning, coding, and intelligent agent capabilities, providing two modes: a thinking mode for complex reasoning and tool usage, and a non-thinking mode for immediate responses. They are released under the MIT license, allowing commercial use and secondary development. GLM-4.5 achieves strong performance on 12 industry-standard benchmarks, ranking 3rd overall, while GLM-4.5-Air balances competitive results with greater efficiency. The models support FP8 and BF16 precision, and can handle very large context windows of up to 128K tokens. Flexible inference is supported through frameworks like vLLM and SGLang with tool-call and reasoning parsers included.
    Downloads: 36 This Week
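
To give a sense of what the parameter counts and precisions in the GLM-4.5 description imply for deployment, the sketch below estimates the size of the weights alone. It assumes 1 byte per parameter for FP8 and 2 bytes for BF16, counts 1 GB as 10^9 bytes, and ignores the KV cache, activations, and runtime overhead; the names are illustrative and not taken from the GLM-4.5 repository.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"FP8": 1, "BF16": 2}

# (total parameters, active parameters) as quoted in the description.
MODELS = {
    "GLM-4.5": (355e9, 32e9),
    "GLM-4.5-Air": (106e9, 12e9),
}


def weight_gigabytes(total_params, precision):
    """Approximate size of the weights only, in GB (10^9 bytes)."""
    return total_params * BYTES_PER_PARAM[precision] / 1e9


def active_share(total_params, active_params):
    """Fraction of parameters used per token in the MoE model."""
    return active_params / total_params


for name, (total, active) in MODELS.items():
    fp8 = weight_gigabytes(total, "FP8")
    bf16 = weight_gigabytes(total, "BF16")
    share = active_share(total, active)
    print(f"{name}: ~{fp8:.0f} GB FP8, ~{bf16:.0f} GB BF16, "
          f"{share:.1%} of parameters active per token")
```

All experts must be resident in memory even though only a fraction is active per token, which is why the estimate uses the total rather than the active parameter count.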
  • 11
    DeepSeek V2

    Strong, Economical, and Efficient Mixture-of-Experts Language Model

    DeepSeek-V2 is the second major iteration of DeepSeek’s foundation language model (LLM) series. This version likely includes architectural improvements, training enhancements, and expanded dataset coverage compared to V1. The repository includes model weight artifacts, evaluation benchmarks across a broad suite (e.g. reasoning, math, multilingual), configuration files, and possibly tokenization / inference scripts. The V2 model is expected to support more advanced features like better context window handling, more efficient inference, better performance on challenging tasks, and stronger alignment with human feedback. Because DeepSeek is pushing open-weight competition, this V2 iteration is meant to solidify its position in benchmark rankings and in developer adoption. The code in the repository may include description files, support for tool use or plug-in architectures, and artifacts showing fine-tuning or prompt templates.
    Downloads: 31 This Week
  • 12
    GLM-4.7

    Advanced language and coding AI model

    GLM-4.7 is an advanced agent-oriented large language model designed as a high-performance coding and reasoning partner. It delivers significant gains over GLM-4.6 in multilingual agentic coding, terminal-based workflows, and real-world developer benchmarks such as SWE-bench and Terminal Bench 2.0. The model introduces stronger “thinking before acting” behavior, improving stability and accuracy in complex agent frameworks like Claude Code, Cline, and Roo Code. GLM-4.7 also advances “vibe coding,” producing cleaner, more modern UIs, better-structured webpages, and visually improved slide layouts. Its tool-use capabilities are substantially enhanced, with notable improvements in browsing, search, and tool-integrated reasoning tasks. Overall, GLM-4.7 shows broad performance upgrades across coding, reasoning, chat, creative writing, and role-play scenarios.
    Downloads: 30 This Week
  • 13
    FLUX.2

    Official inference repo for FLUX.2 models

    FLUX.2 is a state-of-the-art open-weight image generation and editing model released by Black Forest Labs aimed at bridging the gap between research-grade capabilities and production-ready workflows. The model offers both text-to-image generation and powerful image editing, including editing of multiple reference images, with fidelity, consistency, and realism that push the limits of what open-source generative models have achieved. It supports high-resolution output (up to ~4 megapixels), which allows for photography-quality images, detailed product shots, infographics or UI mockups rather than just low-resolution drafts. FLUX.2 is built with a modern architecture (a flow-matching transformer + a revamped VAE + a strong vision-language encoder), enabling strong prompt adherence, correct rendering of text/typography in images, reliable lighting, layout, and physical realism, and consistent style/character/product identity across multiple generations or edits.
    Downloads: 27 This Week
  • 14
    Kimi K2

    Kimi K2 is the large language model series developed by Moonshot AI

    Kimi K2 is Moonshot AI’s advanced open-source large language model built on a scalable Mixture-of-Experts (MoE) architecture that combines a trillion total parameters with a subset of ~32 billion active parameters to deliver powerful and efficient performance on diverse tasks. It was trained on an enormous corpus of over 15.5 trillion tokens to push frontier capabilities in coding, reasoning, and general agentic tasks while addressing training stability through novel optimizer and architecture design strategies. The model family includes variants like a foundational base model that researchers can fine-tune for specific use cases and an instruct-optimized variant primed for general-purpose chat and agent-style interactions, offering flexibility for both experimentation and deployment. With its high-dimensional attention mechanisms and expert routing, Kimi-K2 excels across benchmarks in live coding, math reasoning, and problem solving.
    Downloads: 25 This Week
  • 15
    Z-Image

    Image generation model with single-stream diffusion transformer

    Z-Image is an efficient, open-source image generation foundation model built to make high-quality image synthesis more accessible. With just 6 billion parameters — far fewer than many large-scale models — it uses a novel “single-stream diffusion Transformer” architecture to deliver photorealistic image generation, demonstrating that excellence does not always require extremely large model sizes. The project includes several variants: Z-Image-Turbo, a distilled version optimized for speed and low resource consumption; Z-Image-Base, the full-capacity foundation model; and Z-Image-Edit, fine-tuned for image editing tasks. Despite its compact size, Z-Image produces outputs that closely rival those from much larger models — including strong rendering of bilingual (English and Chinese) text inside images, accurate prompt adherence, and good layout and composition.
    Downloads: 25 This Week
  • 16
    LTX-2

    Python inference and LoRA trainer package for the LTX-2 audio–video

    LTX-2 is an open-source audio-video foundation model developed by Lightricks that generates synchronized video and audio from prompts or other inputs. This repository provides the Python inference code and a LoRA trainer for the model. LTX-2 uses a diffusion-transformer-based architecture that generates visual frames together with corresponding audio such as speech, music, ambient sound, or effects, so that motion, timing, and sound stay aligned. It is designed for both research and production workflows.
    Downloads: 22 This Week
  • 17
    Stable Diffusion

    High-Resolution Image Synthesis with Latent Diffusion Models

    This repository hosts Stable Diffusion Version 2. Developed by Stability AI, Stable Diffusion is an image synthesis model that uses latent diffusion for high-resolution, text-conditioned image generation, which makes it flexible for a range of creative applications. The repository contains pretrained models, various checkpoints, and tools for image generation tasks such as fine-tuning and modifying the models. Working in a compressed latent space rather than pixel space keeps generation efficient while still producing detailed images.
    Downloads: 201 This Week
  • 18
    Qwen3-TTS

    Qwen3-TTS is an open-source series of TTS models

    Qwen3-TTS is an open-source text-to-speech (TTS) project built around the Qwen3 large language model family, focused on generating high-quality, natural-sounding speech from plain text input. It provides researchers and developers with tools to transform text into expressive, intelligible audio, supporting multiple languages and voice characteristics tuned for clarity and fluidity. The project includes pre-trained models and inference scripts that let users synthesize speech locally or integrate TTS into larger pipelines such as voice assistants, accessibility tools, or multimedia generation workflows. Because it’s part of the broader Qwen ecosystem, it benefits from the model’s understanding of linguistic nuances, enabling more accurate pronunciation, prosody, and contextual delivery than many traditional TTS systems. Developers can customize voice output parameters like speed, pitch, and volume, and combine the TTS stack with other AI components.
    Downloads: 21 This Week
  • 19
    Hunyuan3D-2.1

    From Images to High-Fidelity 3D Assets

    Hunyuan3D-2.1 is Tencent Hunyuan’s 3D asset generation system, which produces high-fidelity 3D models with Physically Based Rendering (PBR) textures. It is fully open-source, with released model weights, training code, and inference code. It improves on prior versions with a PBR texture pipeline that models realistic material effects such as reflections and subsurface scattering, and it allows community fine-tuning and extension. It provides separate modules for shape generation (mesh geometry) and texture generation. Cross-platform support (macOS, Windows, Linux) is provided via Python and PyTorch, including diffusers-style APIs.
    Downloads: 20 This Week
  • 20
    OpenMythos

    A theoretical reconstruction of the Claude Mythos architecture

    OpenMythos is an experimental, open-source implementation that attempts to reconstruct a hypothesized architecture behind advanced language models using a design called a Recurrent-Depth Transformer. The project explores the idea that, instead of stacking hundreds of unique transformer layers, a smaller set of layers can be reused iteratively during inference to achieve deeper reasoning without increasing parameter count. It divides computation into three stages: a pre-processing phase, a looped recurrent reasoning block, and a final output refinement stage. The architecture incorporates mixture-of-experts routing, adaptive computation time, and multiple attention mechanisms to allocate compute dynamically where needed. It is highly configurable through a centralized configuration system, allowing experimentation with architectural parameters such as loop depth and attention type.
    Downloads: 19 This Week
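
The recurrent-depth idea in the OpenMythos description, reusing one block several times instead of stacking distinct layers, can be illustrated with a toy model. The sketch below is not OpenMythos code: the class `RecurrentDepthToy`, its scalar "layers", and every number in it are hypothetical stand-ins, and it omits attention, mixture-of-experts routing, and adaptive computation time entirely. It only shows that increasing the loop depth adds computation steps without adding parameters.

```python
import math


def make_layer(weight, bias):
    """Return a toy layer: an elementwise affine map followed by tanh."""
    def layer(hidden):
        return [math.tanh(weight * h + bias) for h in hidden]
    return layer


class RecurrentDepthToy:
    """Three stages: pre-processing, a looped shared block, output refinement."""

    def __init__(self, loop_depth):
        self.loop_depth = loop_depth
        self.prelude = make_layer(0.5, 0.0)        # pre-processing stage
        self.shared_block = make_layer(0.9, 0.1)   # reused on every loop
        self.coda = make_layer(1.0, 0.0)           # output refinement stage

    def parameter_count(self):
        # Three layers with one weight and one bias each,
        # regardless of how many times the shared block is applied.
        return 3 * 2

    def effective_depth(self):
        # Number of layer applications in one forward pass.
        return 1 + self.loop_depth + 1

    def forward(self, inputs):
        hidden = self.prelude(inputs)
        for _ in range(self.loop_depth):
            hidden = self.shared_block(hidden)
        return self.coda(hidden)


shallow = RecurrentDepthToy(loop_depth=2)
deep = RecurrentDepthToy(loop_depth=8)
for model in (shallow, deep):
    print(model.parameter_count(), model.effective_depth(),
          model.forward([0.2, -0.4, 1.0]))
```

Both instances report the same parameter count, while the second performs more layer applications per forward pass; loop depth is the kind of setting the description says the configuration system exposes.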
  • 21
    VibeVoice

    Open-source multi-speaker long-form text-to-speech model

    VibeVoice-1.5B is Microsoft’s frontier open-source text-to-speech (TTS) model designed for generating expressive, long-form, multi-speaker conversational audio such as podcasts. Unlike traditional TTS systems, it excels in scalability, speaker consistency, and natural turn-taking for up to 90 minutes of continuous speech with as many as four distinct speakers. A key innovation is its use of continuous acoustic and semantic speech tokenizers operating at an ultra-low frame rate of 7.5 Hz, enabling high audio fidelity with efficient processing of long sequences. The model integrates a Qwen2.5-based large language model with a diffusion head to produce realistic acoustic details and capture conversational context. Training involved curriculum learning with increasing sequence lengths up to 65K tokens, allowing VibeVoice to handle very long dialogues effectively. Safety mechanisms include an audible disclaimer and imperceptible watermarking in all generated audio to mitigate misuse risks.
    Downloads: 17 This Week
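
A quick calculation shows why the 7.5 Hz frame rate in the VibeVoice description matters for long audio. The sketch below counts tokenizer frames for a 90-minute session and compares the result with a 50 Hz rate, which is a hypothetical figure chosen only for contrast. Reading "65K tokens" as 65,000 and treating one frame as one sequence position are both simplifying assumptions, not details from the VibeVoice repository.

```python
def frames_for(duration_minutes, frame_rate_hz):
    """Number of tokenizer frames needed to cover the given duration."""
    return int(duration_minutes * 60 * frame_rate_hz)


MAX_MINUTES = 90             # longest continuous speech quoted above
VIBEVOICE_RATE_HZ = 7.5      # frame rate quoted above
COMPARISON_RATE_HZ = 50      # hypothetical higher rate, for contrast only
TRAINING_SEQUENCE = 65_000   # "65K tokens", read as 65,000 (assumption)

low_rate_frames = frames_for(MAX_MINUTES, VIBEVOICE_RATE_HZ)
high_rate_frames = frames_for(MAX_MINUTES, COMPARISON_RATE_HZ)

print(f"frames at 7.5 Hz: {low_rate_frames:,}")
print(f"frames at 50 Hz:  {high_rate_frames:,}")
print(f"fits in training length: {low_rate_frames <= TRAINING_SEQUENCE}")
```

At 7.5 Hz, 90 minutes of audio needs 40,500 frames, which stays below the quoted training sequence length, whereas the hypothetical 50 Hz rate would need 270,000.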
  • 22
    HeartMuLa

    A Family of Open Sourced Music Foundation Models

    HeartMuLa is the open-source library and reference implementation for the HeartMuLa family of music foundation models, designed to support both music generation and music-related understanding tasks in a cohesive stack. At the center is HeartMuLa, a music language model that generates music conditioned on inputs like lyrics and tags, with multilingual support that broadens the range of lyric-driven use cases. The project also includes HeartCodec, a music codec optimized for high reconstruction fidelity, enabling efficient tokenization and reconstruction workflows that are critical for training and generation pipelines. For text extraction from audio, it provides HeartTranscriptor, a Whisper-based model tuned specifically for lyrics transcription, which helps bridge generated or recorded audio back into structured text. It also introduces HeartCLAP, which aligns audio and text into a shared embedding space.
    Downloads: 16 This Week
  • 23
    Qwen3.5

    Qwen3.5 is the large language model series developed by Qwen team

    Qwen3.5 is part of Alibaba’s Qwen family of large language and multimodal foundation models, designed to power advanced AI applications such as chatbots, coding assistants, and autonomous agents. The project represents a significant step toward “agentic AI,” meaning models that can reason through multi-step tasks and interact with external tools or environments rather than only generating text. Qwen3.5 builds on earlier Qwen generations by improving multilingual understanding, reasoning ability, and efficiency, while also introducing native multimodal capabilities that allow the model to work with both language and visual inputs. Architecturally, the system leverages modern large-scale training techniques and mixture-of-experts style efficiency so that very large parameter counts can be used while keeping inference practical.
    Downloads: 16 This Week
  • 24
    SAM 3D Objects

    Models for object and human mesh reconstruction

    SAM 3D Objects is a foundation model that reconstructs full 3D geometry, texture, and spatial layout of objects and scenes from a single image. Given one RGB image and object masks (for example, from the Segment Anything family), it can generate a textured 3D mesh for each object, including pose and approximate scene layout. The model is specifically designed to be robust in real-world images with clutter, occlusions, small objects, and unusual viewpoints, where many earlier 3D-from-image systems struggle. It supports both single-object and multi-object generation, allowing you to reconstruct entire scenes rather than just isolated items. The repository provides code to run inference, a quickstart demo.py script, and environment setup instructions that connect to hosted checkpoints and configuration files. Outputs are aimed at downstream usability: the reconstructed assets are textured meshes suitable for further editing, rendering, or integration into 3D pipelines and engines.
    Downloads: 16 This Week
  • 25
    ComfyUI-LTXVideo

    LTX-Video Support for ComfyUI

    ComfyUI-LTXVideo is a bridge between ComfyUI’s node-based generative workflow environment and the LTX-Video multimedia processing framework, enabling creators to orchestrate complex video tasks within a visual graph paradigm. Instead of writing code to apply effects, transitions, edits, and data flows, users can assemble nodes that represent video inputs, transformations, and outputs, letting them prototype and automate video production pipelines visually. This integration empowers non-programmers and rapid-iteration teams to harness the performance of LTX-Video while maintaining the clarity and flexibility of a dataflow graph model. It supports nodes for common video operations like trimming, layering, color grading, and generative augmentations, making it suitable for everything from simple clip edits to complex sequences with conditional behavior.
    Downloads: 13 This Week