Showing 381 open source projects for "gnu/linux"

  • 1
    Ministral 3 8B Instruct 2512

    Compact 8B multimodal instruct model optimized for edge deployment

    Ministral 3 8B Instruct 2512 is a balanced, efficient model in the Ministral 3 family, offering strong multimodal capabilities within a compact footprint. It combines an 8.4B-parameter language model with a 0.4B vision encoder, enabling both text reasoning and image understanding. This FP8 instruct-fine-tuned variant is optimized for chat, instruction following, and structured outputs, making it ideal for daily assistant tasks and lightweight agentic workflows. Designed for edge deployment,...
    Downloads: 0 This Week
  • 2
    GigaChat 3 Ultra

    High-performance MoE model with MLA, MTP, and multilingual reasoning

    GigaChat 3 Ultra is a flagship instruct-model built on a custom Mixture-of-Experts architecture with 702B total and 36B active parameters. It leverages Multi-head Latent Attention to compress the KV cache into latent vectors, dramatically reducing memory demand and improving inference speed at scale. The model also employs Multi-Token Prediction, enabling multi-step token generation in a single pass for up to 40% faster output through speculative and parallel decoding techniques. Its...
    Downloads: 0 This Week
  • 3
    DeepSeek-V3.2-Speciale

    High-compute ultra-reasoning model surpassing GPT-5

    DeepSeek-V3.2-Speciale is the high-compute, ultra-reasoning variant of DeepSeek-V3.2, designed specifically to push the boundaries of mathematical, logical, and algorithmic intelligence. It builds on the DeepSeek Sparse Attention (DSA) framework, delivering dramatically improved long-context efficiency while preserving full model quality. Unlike the standard version, Speciale is tuned exclusively for deep reasoning and therefore does not support tool-calling, focusing its full capacity on...
    Downloads: 0 This Week
  • 4
    DeepSeek-V3.2

    High-efficiency reasoning and agentic intelligence model

    DeepSeek-V3.2 is a cutting-edge large language model developed by DeepSeek-AI, focused on achieving high reasoning accuracy and computational efficiency for agentic tasks. It introduces DeepSeek Sparse Attention (DSA), a new attention mechanism that dramatically reduces computational overhead while maintaining strong long-context performance. Built with a scalable reinforcement learning framework, it reaches near-GPT-5 levels of reasoning and outperforms comparable models like DeepSeek-V3.1...
    Downloads: 0 This Week
  • 5
    DeepSeek-V3.1-Terminus

    685B model with improved agents and consistency

    DeepSeek-V3.1-Terminus is an updated release in the DeepSeek-V3.1 series, maintaining the original model’s large-scale reasoning and generative capabilities while addressing several key user-reported issues. It improves language consistency, reducing mixed Chinese-English outputs and eliminating abnormal characters, enhancing reliability in multilingual scenarios. The update also refines agentic capabilities, especially for the Code Agent and Search Agent, leading to better tool integration...
    Downloads: 0 This Week
  • 6
    Qwen3-Next

    Qwen3-Next: 80B instruct LLM with ultra-long context up to 1M tokens

    Qwen3-Next-80B-A3B-Instruct is the flagship release in the Qwen3-Next series, designed as a next-generation foundation model for ultra-long context and efficient reasoning. With 80B total parameters and 3B activated at a time, it leverages hybrid attention (Gated DeltaNet + Gated Attention) and a high-sparsity Mixture-of-Experts architecture to achieve exceptional efficiency. The model natively supports a context length of 262K tokens and can be extended up to 1 million tokens using RoPE...
    Downloads: 0 This Week
  • 7
    Mellum-4b-base

    JetBrains’ 4B parameter code model for completions

    Mellum-4b-base is JetBrains’ first open-source large language model designed and optimized for code-related tasks. Built with 4 billion parameters and a LLaMA-style architecture, it was trained on over 4.2 trillion tokens across multiple programming languages, including datasets such as The Stack, StarCoder, and CommitPack. With a context window of 8,192 tokens, it excels at code completion, fill-in-the-middle tasks, and intelligent code suggestions for professional developer tools and IDEs....
    Downloads: 0 This Week
  • 8
    Jan-v1-edge

    Jan-v1-edge: efficient 1.7B reasoning model optimized for edge devices

    Jan-v1-edge is a lightweight agentic language model developed by JanHQ, designed for fast and reliable on-device execution. It is the second release in the Jan Family and was distilled from the larger Jan-v1 model, retaining strong reasoning and problem-solving capabilities while reducing its computational footprint. The model was refined through a two-stage post-training process: Supervised Fine-Tuning (SFT) to transfer knowledge from Jan-v1, followed by Reinforcement Learning with...
    Downloads: 0 This Week
  • 9
    Qwen-Image-Edit

    Advanced bilingual image editing with semantic control

    Qwen-Image-Edit is the image editing extension of Qwen-Image, a 20B parameter model that combines advanced visual and text-rendering capabilities for creative and precise editing. It leverages both Qwen2.5-VL for semantic control and a VAE Encoder for appearance control, enabling users to edit at both the content and detail level. The model excels at semantic edits like style transfer, object rotation, and novel view synthesis, while also handling precise appearance edits such as adding or...
    Downloads: 0 This Week
  • 10
    OpenVLA 7B

    Vision-language-action model for robot control via images and text

    OpenVLA 7B is a multimodal vision-language-action model trained on 970,000 robot manipulation episodes from the Open X-Embodiment dataset. It takes camera images and natural language instructions as input and outputs normalized 7-DoF robot actions, enabling control of multiple robot types across various domains. Built on top of LLaMA-2 and DINOv2/SigLIP visual backbones, it allows both zero-shot inference for known robot setups and parameter-efficient fine-tuning for new domains. The model...
    Downloads: 0 This Week
  • 11
    Llama-3.2-1B-Instruct

    Instruction-tuned 1.2B LLM for multilingual text generation by Meta

    Llama-3.2-1B-Instruct is Meta’s multilingual, instruction-tuned large language model with 1.24 billion parameters, optimized for dialogue, summarization, and retrieval tasks. It builds upon the Llama 3.1 architecture and incorporates fine-tuning techniques like SFT, DPO, and quantization-aware training for improved alignment, efficiency, and safety. The model supports eight primary languages (including English, Spanish, Hindi, and Thai) and was trained on a curated mix of publicly available...
    Downloads: 0 This Week
  • 12
    Bio_ClinicalBERT

    ClinicalBERT model trained on MIMIC notes for clinical NLP tasks

    Bio_ClinicalBERT is a domain-specific language model tailored for clinical natural language processing (NLP), extending BioBERT with additional training on clinical notes. It was initialized from BioBERT-Base v1.0 and further pre-trained on all clinical notes from the MIMIC-III database (~880M words), which includes ICU patient records. The training focused on improving performance in tasks like named entity recognition and natural language inference within the healthcare domain. Notes were...
    Downloads: 0 This Week
  • 13
    CLIP-ViT-bigG-14-laion2B-39B-b160k

    CLIP ViT-bigG/14: Zero-shot image-text model trained on LAION-2B

    CLIP-ViT-bigG-14-laion2B-39B-b160k is a powerful vision-language model trained on the English subset of the LAION-5B dataset using the OpenCLIP framework. Developed by LAION and trained by Mitchell Wortsman on Stability AI’s compute infrastructure, it pairs a ViT-bigG/14 vision transformer with a text encoder to perform contrastive learning on image-text pairs. This model excels at zero-shot image classification, image-to-text and text-to-image retrieval, and can be adapted for tasks such as...
    Downloads: 0 This Week
  • 14
    fashion-clip

    CLIP model fine-tuned for zero-shot fashion product classification

    FashionCLIP is a domain-adapted CLIP model fine-tuned specifically for the fashion industry, enabling zero-shot classification and retrieval of fashion products. Developed by Patrick John Chia and collaborators, it builds on the CLIP ViT-B/32 architecture and was trained on over 800K image-text pairs from the Farfetch dataset. The model learns to align product images and descriptive text using contrastive learning, enabling it to perform well across various fashion-related tasks without...
    Downloads: 0 This Week
  • 15
    Hunyuan-A13B-Instruct

    Efficient 13B MoE language model with long context and reasoning modes

    Hunyuan-A13B-Instruct is a powerful instruction-tuned large language model developed by Tencent using a fine-grained Mixture-of-Experts (MoE) architecture. While the total model includes 80 billion parameters, only 13 billion are active per forward pass, making it highly efficient while maintaining strong performance across benchmarks. It supports up to 256K context tokens, advanced reasoning (CoT) abilities, and agent-based workflows with tool parsing. The model offers both fast and slow...
    Downloads: 0 This Week
  • 16
    granite-timeseries-ttm-r2

    Tiny pre-trained IBM model for multivariate time series forecasting

    granite-timeseries-ttm-r2 is part of IBM’s TinyTimeMixers (TTM) series—compact, pre-trained models for multivariate time series forecasting. Unlike massive foundation models, TTM models are designed to be lightweight yet powerful, with only ~805K parameters, enabling high performance even on CPU or single-GPU machines. The r2 version is pre-trained on ~700M samples (r2.1 expands to ~1B), delivering up to 15% better accuracy than the r1 version. TTM supports both zero-shot and fine-tuned...
    Downloads: 0 This Week
  • 17
    wav2vec2-large-xlsr-53-russian

    Russian ASR model fine-tuned on Common Voice and CSS10 datasets

    wav2vec2-large-xlsr-53-russian is a fine-tuned automatic speech recognition (ASR) model based on Facebook’s wav2vec2-large-xlsr-53 and optimized for Russian. It was trained using Mozilla’s Common Voice 6.1 and CSS10 datasets to recognize Russian speech with high accuracy. The model operates best with audio sampled at 16kHz and can transcribe Russian speech directly without a language model. It achieves a Word Error Rate (WER) of 13.3% and Character Error Rate (CER) of 2.88% on the Common...
    Downloads: 0 This Week
  • 18
    Mistral Large 3 675B Base 2512

    Frontier-scale 675B multimodal base model for custom AI training

    Mistral Large 3 675B Base 2512 is the foundational, pre-trained version of the Mistral Large 3 family, built as a frontier-scale multimodal Mixture-of-Experts model with 41B active parameters and a total size of 675B. It is trained from scratch using 3000 H200 GPUs, making it one of the most advanced and compute-intensive open-weight models available. As the base version, it is not fine-tuned for instruction following or reasoning, making it ideal for teams planning their own domain-specific...
    Downloads: 0 This Week
  • 19
    Mistral Large 3 675B Instruct 2512 Eagle

    Speculative-decoding accelerator for the 675B Mistral Large 3

    Mistral Large 3 675B Instruct 2512 Eagle is the dedicated speculative-decoding draft model for the full Mistral Large 3 Instruct system, designed to significantly speed up generation while preserving high output quality. It works alongside the primary 675B instruct model, enabling faster response times by predicting several tokens ahead using Mistral’s Eagle speculative method. Built on the same frontier-scale multimodal Mixture-of-Experts architecture, it complements a system featuring 41B...
    Downloads: 0 This Week
  • 20
    Mistral Large 3 675B Instruct 2512 NVFP4

    Quantized 675B multimodal instruct model optimized for NVFP4

    Mistral Large 3 675B Instruct 2512 NVFP4 is a frontier-scale multimodal Mixture-of-Experts model featuring 675B total parameters and 41B active parameters, trained from scratch on 3,000 H200 GPUs. This NVFP4 checkpoint is a post-training-activation quantized version of the original instruct model, created through a collaboration between Mistral AI, vLLM, and Red Hat using llm-compressor. It retains the same instruction-tuned behavior as the FP8 model, making it ideal for production...
    Downloads: 0 This Week
  • 21
    Ministral 3 3B Base 2512

    Small 3B-base multimodal model ideal for custom AI on edge hardware

    Ministral 3 3B Base 2512 is the smallest model in the Ministral 3 family, offering a compact yet capable multimodal architecture suited for lightweight AI applications. It combines a 3.4B-parameter language model with a 0.4B vision encoder, enabling both text and image understanding in a tiny footprint. As the base pretrained model, it is not fine-tuned for instructions or reasoning, making it the ideal foundation for custom post-training, domain adaptation, or specialized downstream tasks....
    Downloads: 0 This Week
  • 22
    Ministral 3 8B Reasoning 2512

    Efficient 8B multimodal model tuned for advanced reasoning tasks

    Ministral 3 8B Reasoning 2512 is a balanced midsize model in the Ministral 3 family, delivering strong multimodal reasoning capabilities within an efficient footprint. It combines an 8.4B-parameter language model with a 0.4B vision encoder, enabling it to process both text and images for advanced reasoning tasks. This version is specifically post-trained for reasoning, making it well-suited for math, coding, and STEM applications requiring multi-step logic and problem-solving. Despite its...
    Downloads: 0 This Week
  • 23
    Ministral 3 14B Reasoning 2512

    High-precision 14B multimodal model built for advanced reasoning tasks

    Ministral 3 14B Reasoning 2512 is the largest model in the Ministral 3 series, delivering frontier-level performance with capabilities comparable to the Mistral Small 3.2 24B model. It pairs a 13.5B-parameter language model with a 0.4B vision encoder, enabling strong multimodal reasoning across both text and images. This version is specifically post-trained for reasoning tasks, making it highly effective for math, coding, STEM workloads, and complex multi-step problem-solving. Despite its...
    Downloads: 0 This Week
  • 24
    Ministral 3 3B Instruct 2512

    Ultra-efficient 3B multimodal instruct model built for edge deployment

    Ministral 3 3B Instruct 2512 is the smallest model in the Ministral 3 family, offering a lightweight yet capable multimodal architecture designed for edge and low-resource deployments. It includes a 3.4B-parameter language model paired with a 0.4B vision encoder, enabling it to understand both text and visual inputs. As an FP8 instruct-fine-tuned model, it is optimized for chat, instruction following, and compact agentic tasks while maintaining strong adherence to system prompts. Despite its...
    Downloads: 0 This Week
  • 25
    Ministral 3 14B Instruct 2512

    Efficient 14B multimodal instruct model with edge deployment and FP8

    Ministral 3 14B Instruct 2512 is the largest model in the Ministral 3 family, delivering frontier performance comparable to much larger systems while remaining optimized for edge-level deployment. It combines a 13.5B-parameter language model with a 0.4B-parameter vision encoder, enabling strong multimodal understanding in both text and image tasks. This FP8 instruct-tuned variant is designed specifically for chat, instruction following, and agentic workflows with robust system-prompt...
    Downloads: 0 This Week