Join/Login
Business Software
Open Source Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Business Software

Open Source Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Open Source Software
Artificial Intelligence
AI Models
Search Results

Search Results for "gnu/linux" - Page 15

x

Sort By:

Relevance

Clear All Filters

OS

Linux 381
Mac 349
Windows 340
More...
BSD 261
ChromeOS 261
Mobile Operating Systems 5

Category

Artificial Intelligence 381
Multimedia 7
Scientific/Engineering 6
Software Development 4
Business 1
Education 1
Security 1

License

OSI-Approved Open Source 254
Creative Commons Attribution License 12
Other License 3

Translations

English 3
Chinese (Simplified) 1
Chinese (Traditional) 1

Programming Language

Python 248
C++ 18
Unix Shell 14
C 5
More...
JavaScript 5
TypeScript 4
C# 2
Go 2
Lua 2
Rust 1
Tcl 1

Status

Alpha 1

Showing 381 open source projects for "gnu/linux"

View related business solutions

AI Models Linux Clear Filters & Widen Search

Go From AI Idea to AI App Fast
One platform to build, fine-tune, and deploy ML models. No MLOps team required.

Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.

Try Free
Full-stack observability with actually useful AI | Grafana Cloud
Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.

Create free account
1

Ministral 3 8B Instruct 2512

Compact 8B multimodal instruct model optimized for edge deployment

Ministral 3 8B Instruct 2512 is a balanced, efficient model in the Ministral 3 family, offering strong multimodal capabilities within a compact footprint. It combines an 8.4B-parameter language model with a 0.4B vision encoder, enabling both text reasoning and image understanding. This FP8 instruct-fine-tuned variant is optimized for chat, instruction following, and structured outputs, making it ideal for daily assistant tasks and lightweight agentic workflows. Designed for edge deployment,...

Downloads: 0 This Week

Last Update: 2025-12-03
See Project
2

GigaChat 3 Ultra

High-performance MoE model with MLA, MTP, and multilingual reasoning

GigaChat 3 Ultra is a flagship instruct-model built on a custom Mixture-of-Experts architecture with 702B total and 36B active parameters. It leverages Multi-head Latent Attention to compress the KV cache into latent vectors, dramatically reducing memory demand and improving inference speed at scale. The model also employs Multi-Token Prediction, enabling multi-step token generation in a single pass for up to 40% faster output through speculative and parallel decoding techniques. Its...

Downloads: 0 This Week

Last Update: 2025-12-03
See Project
3

DeepSeek-V3.2-Speciale

High-compute ultra-reasoning model surpassing model surpassing GPT-5

DeepSeek-V3.2-Speciale is the high-compute, ultra-reasoning variant of DeepSeek-V3.2, designed specifically to push the boundaries of mathematical, logical, and algorithmic intelligence. It builds on the DeepSeek Sparse Attention (DSA) framework, delivering dramatically improved long-context efficiency while preserving full model quality. Unlike the standard version, Speciale is tuned exclusively for deep reasoning and therefore does not support tool-calling, focusing its full capacity on...

Downloads: 0 This Week

Last Update: 2025-12-01
See Project
4

DeepSeek-V3.2

High-efficiency reasoning and agentic intelligence model

DeepSeek-V3.2 is a cutting-edge large language model developed by DeepSeek-AI, focused on achieving high reasoning accuracy and computational efficiency for agentic tasks. It introduces DeepSeek Sparse Attention (DSA), a new attention mechanism that dramatically reduces computational overhead while maintaining strong long-context performance. Built with a scalable reinforcement learning framework, it reaches near-GPT-5 levels of reasoning and outperforms comparable models like DeepSeek-V3.1...

Downloads: 0 This Week

Last Update: 2025-12-01
See Project
Custom VMs From 1 to 96 vCPUs With 99.95% Uptime
General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.

Try Free
5

DeepSeek-V3.1-Terminus

685B model with improved agents and consistency

DeepSeek-V3.1-Terminus is an updated release in the DeepSeek-V3.1 series, maintaining the original model’s large-scale reasoning and generative capabilities while addressing several key user-reported issues. It improves language consistency, reducing mixed Chinese-English outputs and eliminating abnormal characters, enhancing reliability in multilingual scenarios. The update also refines agentic capabilities, especially for the Code Agent and Search Agent, leading to better tool integration...

Downloads: 0 This Week

Last Update: 2025-09-24
See Project
6

Qwen3-Next

Qwen3-Next: 80B instruct LLM with ultra-long context up to 1M tokens

Qwen3-Next-80B-A3B-Instruct is the flagship release in the Qwen3-Next series, designed as a next-generation foundation model for ultra-long context and efficient reasoning. With 80B total parameters and 3B activated at a time, it leverages hybrid attention (Gated DeltaNet + Gated Attention) and a high-sparsity Mixture-of-Experts architecture to achieve exceptional efficiency. The model natively supports a context length of 262K tokens and can be extended up to 1 million tokens using RoPE...

Downloads: 0 This Week

Last Update: 2025-09-12
See Project
7

Mellum-4b-base

JetBrains’ 4B parameter code model for completions

Mellum-4b-base is JetBrains’ first open-source large language model designed and optimized for code-related tasks. Built with 4 billion parameters and a LLaMA-style architecture, it was trained on over 4.2 trillion tokens across multiple programming languages, including datasets such as The Stack, StarCoder, and CommitPack. With a context window of 8,192 tokens, it excels at code completion, fill-in-the-middle tasks, and intelligent code suggestions for professional developer tools and IDEs....

Downloads: 0 This Week

Last Update: 2025-09-11
See Project
8

Jan-v1-edge

Jan-v1-edge: efficient 1.7B reasoning model optimized for edge devices

Jan-v1-edge is a lightweight agentic language model developed by JanHQ, designed for fast and reliable on-device execution. It is the second release in the Jan Family and was distilled from the larger Jan-v1 model, retaining strong reasoning and problem-solving capabilities while reducing its computational footprint. The model was refined through a two-stage post-training process: Supervised Fine-Tuning (SFT) to transfer knowledge from Jan-v1, followed by Reinforcement Learning with...

Downloads: 0 This Week

Last Update: 2025-09-05
See Project
9

Qwen-Image-Edit

An advanced bilingual image editing with semantic control

Qwen-Image-Edit is the image editing extension of Qwen-Image, a 20B parameter model that combines advanced visual and text-rendering capabilities for creative and precise editing. It leverages both Qwen2.5-VL for semantic control and a VAE Encoder for appearance control, enabling users to edit at both the content and detail level. The model excels at semantic edits like style transfer, object rotation, and novel view synthesis, while also handling precise appearance edits such as adding or...

Downloads: 0 This Week

Last Update: 2025-09-01
See Project
MongoDB Atlas runs apps anywhere
Deploy in 115+ regions with the modern database for every enterprise.

MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.

Start Free
10

OpenVLA 7B

Vision-language-action model for robot control via images and text

OpenVLA 7B is a multimodal vision-language-action model trained on 970,000 robot manipulation episodes from the Open X-Embodiment dataset. It takes camera images and natural language instructions as input and outputs normalized 7-DoF robot actions, enabling control of multiple robot types across various domains. Built on top of LLaMA-2 and DINOv2/SigLIP visual backbones, it allows both zero-shot inference for known robot setups and parameter-efficient fine-tuning for new domains. The model...

Downloads: 0 This Week

Last Update: 2025-07-23
See Project
11

Llama-3.2-1B-Instruct

Instruction-tuned 1.2B LLM for multilingual text generation by Meta

Llama-3.2-1B-Instruct is Meta’s multilingual, instruction-tuned large language model with 1.24 billion parameters, optimized for dialogue, summarization, and retrieval tasks. It builds upon the Llama 3.1 architecture and incorporates fine-tuning techniques like SFT, DPO, and quantization-aware training for improved alignment, efficiency, and safety. The model supports eight primary languages (including English, Spanish, Hindi, and Thai) and was trained on a curated mix of publicly available...

Downloads: 0 This Week

Last Update: 2025-07-02
See Project
12

Bio_ClinicalBERT

ClinicalBERT model trained on MIMIC notes for clinical NLP tasks

Bio_ClinicalBERT is a domain-specific language model tailored for clinical natural language processing (NLP), extending BioBERT with additional training on clinical notes. It was initialized from BioBERT-Base v1.0 and further pre-trained on all clinical notes from the MIMIC-III database (~880M words), which includes ICU patient records. The training focused on improving performance in tasks like named entity recognition and natural language inference within the healthcare domain. Notes were...

1 Review

Downloads: 0 This Week

Last Update: 2025-07-02
See Project
13

CLIP-ViT-bigG-14-laion2B-39B-b160k

CLIP ViT-bigG/14: Zero-shot image-text model trained on LAION-2B

CLIP-ViT-bigG-14-laion2B-39B-b160k is a powerful vision-language model trained on the English subset of the LAION-5B dataset using the OpenCLIP framework. Developed by LAION and trained by Mitchell Wortsman on Stability AI’s compute infrastructure, it pairs a ViT-bigG/14 vision transformer with a text encoder to perform contrastive learning on image-text pairs. This model excels at zero-shot image classification, image-to-text and text-to-image retrieval, and can be adapted for tasks such as...

Downloads: 0 This Week

Last Update: 2025-07-02
See Project
14

fashion-clip

CLIP model fine-tuned for zero-shot fashion product classification

FashionCLIP is a domain-adapted CLIP model fine-tuned specifically for the fashion industry, enabling zero-shot classification and retrieval of fashion products. Developed by Patrick John Chia and collaborators, it builds on the CLIP ViT-B/32 architecture and was trained on over 800K image-text pairs from the Farfetch dataset. The model learns to align product images and descriptive text using contrastive learning, enabling it to perform well across various fashion-related tasks without...

Downloads: 0 This Week

Last Update: 2025-07-02
See Project
15

Hunyuan-A13B-Instruct

Efficient 13B MoE language model with long context and reasoning modes

Hunyuan-A13B-Instruct is a powerful instruction-tuned large language model developed by Tencent using a fine-grained Mixture-of-Experts (MoE) architecture. While the total model includes 80 billion parameters, only 13 billion are active per forward pass, making it highly efficient while maintaining strong performance across benchmarks. It supports up to 256K context tokens, advanced reasoning (CoT) abilities, and agent-based workflows with tool parsing. The model offers both fast and slow...

Downloads: 0 This Week

Last Update: 2025-07-02
See Project
16

granite-timeseries-ttm-r2

Tiny pre-trained IBM model for multivariate time series forecasting

granite-timeseries-ttm-r2 is part of IBM’s TinyTimeMixers (TTM) series—compact, pre-trained models for multivariate time series forecasting. Unlike massive foundation models, TTM models are designed to be lightweight yet powerful, with only ~805K parameters, enabling high performance even on CPU or single-GPU machines. The r2 version is pre-trained on ~700M samples (r2.1 expands to ~1B), delivering up to 15% better accuracy than the r1 version. TTM supports both zero-shot and fine-tuned...

Downloads: 0 This Week

Last Update: 2025-07-01
See Project
17

wav2vec2-large-xlsr-53-russian

Russian ASR model fine-tuned on Common Voice and CSS10 datasets

wav2vec2-large-xlsr-53-russian is a fine-tuned automatic speech recognition (ASR) model based on Facebook’s wav2vec2-large-xlsr-53 and optimized for Russian. It was trained using Mozilla’s Common Voice 6.1 and CSS10 datasets to recognize Russian speech with high accuracy. The model operates best with audio sampled at 16kHz and can transcribe Russian speech directly without a language model. It achieves a Word Error Rate (WER) of 13.3% and Character Error Rate (CER) of 2.88% on the Common...

Downloads: 0 This Week

Last Update: 2025-07-01
See Project
18

Mistral Large 3 675B Base 2512

Frontier-scale 675B multimodal base model for custom AI training

Mistral Large 3 675B Base 2512 is the foundational, pre-trained version of the Mistral Large 3 family, built as a frontier-scale multimodal Mixture-of-Experts model with 41B active parameters and a total size of 675B. It is trained from scratch using 3000 H200 GPUs, making it one of the most advanced and compute-intensive open-weight models available. As the base version, it is not fine-tuned for instruction following or reasoning, making it ideal for teams planning their own domain-specific...

Downloads: 0 This Week

Last Update: 2025-12-03
See Project
19

Mistral Large 3 675B Instruct 2512 Eagle

Speculative-decoding accelerator for the 675B Mistral Large 3

Mistral Large 3 675B Instruct 2512 Eagle is the dedicated speculative-decoding draft model for the full Mistral Large 3 Instruct system, designed to significantly speed up generation while preserving high output quality. It works alongside the primary 675B instruct model, enabling faster response times by predicting several tokens ahead using Mistral’s Eagle speculative method. Built on the same frontier-scale multimodal Mixture-of-Experts architecture, it complements a system featuring 41B...

Downloads: 0 This Week

Last Update: 2025-12-03
See Project
20

Mistral Large 3 675B Instruct 2512 NVFP4

Quantized 675B multimodal instruct model optimized for NVFP4

Mistral Large 3 675B Instruct 2512 NVFP4 is a frontier-scale multimodal Mixture-of-Experts model featuring 675B total parameters and 41B active parameters, trained from scratch on 3,000 H200 GPUs. This NVFP4 checkpoint is a post-training-activation quantized version of the original instruct model, created through a collaboration between Mistral AI, vLLM, and Red Hat using llm-compressor. It retains the same instruction-tuned behavior as the FP8 model, making it ideal for production...

Downloads: 0 This Week

Last Update: 2025-12-03
See Project
21

Ministral 3 3B Base 2512

Small 3B-base multimodal model ideal for custom AI on edge hardware

Ministral 3 3B Base 2512 is the smallest model in the Ministral 3 family, offering a compact yet capable multimodal architecture suited for lightweight AI applications. It combines a 3.4B-parameter language model with a 0.4B vision encoder, enabling both text and image understanding in a tiny footprint. As the base pretrained model, it is not fine-tuned for instructions or reasoning, making it the ideal foundation for custom post-training, domain adaptation, or specialized downstream tasks....

Downloads: 0 This Week

Last Update: 2025-12-03
See Project
22

Ministral 3 8B Reasoning 2512

Efficient 8B multimodal model tuned for advanced reasoning tasks.

Ministral 3 8B Reasoning 2512 is a balanced midsize model in the Ministral 3 family, delivering strong multimodal reasoning capabilities within an efficient footprint. It combines an 8.4B-parameter language model with a 0.4B vision encoder, enabling it to process both text and images for advanced reasoning tasks. This version is specifically post-trained for reasoning, making it well-suited for math, coding, and STEM applications requiring multi-step logic and problem-solving. Despite its...

Downloads: 0 This Week

Last Update: 2025-12-03
See Project
23

Ministral 3 14B Reasoning 2512

High-precision 14B multimodal model built for advanced reasoning tasks

Ministral 3 14B Reasoning 2512 is the largest model in the Ministral 3 series, delivering frontier-level performance with capabilities comparable to the Mistral Small 3.2 24B model. It pairs a 13.5B-parameter language model with a 0.4B vision encoder, enabling strong multimodal reasoning across both text and images. This version is specifically post-trained for reasoning tasks, making it highly effective for math, coding, STEM workloads, and complex multi-step problem-solving. Despite its...

Downloads: 0 This Week

Last Update: 2025-12-03
See Project
24

Ministral 3 3B Instruct 2512

Ultra-efficient 3B multimodal instruct model built for edge deployment

Ministral 3 3B Instruct 2512 is the smallest model in the Ministral 3 family, offering a lightweight yet capable multimodal architecture designed for edge and low-resource deployments. It includes a 3.4B-parameter language model paired with a 0.4B vision encoder, enabling it to understand both text and visual inputs. As an FP8 instruct-fine-tuned model, it is optimized for chat, instruction following, and compact agentic tasks while maintaining strong adherence to system prompts. Despite its...

Downloads: 0 This Week

Last Update: 2025-12-03
See Project
25

Ministral 3 14B Instruct 2512

Efficient 14B multimodal instruct model with edge deployment and FP8

Ministral 3 14B Instruct 2512 is the largest model in the Ministral 3 family, delivering frontier performance comparable to much larger systems while remaining optimized for edge-level deployment. It combines a 13.5B-parameter language model with a 0.4B-parameter vision encoder, enabling strong multimodal understanding in both text and image tasks. This FP8 instruct-tuned variant is designed specifically for chat, instruction following, and agentic workflows with robust system-prompt...

Downloads: 0 This Week

Last Update: 2025-12-03
See Project

Previous
11
12
13
14
You're on page 15
16
Next

Related Categories

Artificial Intelligence

Multimedia

Scientific/Engineering

Software Development

Business

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
1320 Columbia Street Suite 310
San Diego, CA 92101
+1 (858) 422-6466

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2026 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise