native free download - SourceForge

35 projects for "native" with 2 filters applied:

AI Models BSD Clear Filters & Widen Search

Our Free Plans just got better! | Auth0
With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.

Try free now
MongoDB Atlas runs apps anywhere
Deploy in 115+ regions with the modern database for every enterprise.

MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.

Start Free
1

LTX-2

Python inference and LoRA trainer package for the LTX-2 audio–video

...Beyond basic rendering scaffolding, LTX-2 includes optimized math libraries, resource loaders, utilities for texture and buffer handling, and integration points for native event loops and input systems. The framework targets both interactive graphical applications and media-rich experiences, making it a solid foundation for games, creative tools, or visualization systems that demand both performance and flexibility. While being low-level, it also provides sensible defaults and helper abstractions that reduce boilerplate and help teams maintain clear, maintainable code.

Downloads: 16 This Week

Last Update: 2026-05-28
See Project
2

LTX-2.3

Official Python inference and LoRA trainer package

LTX-2.3 is an open-source multimodal artificial intelligence foundation model developed by Lightricks for generating synchronized video and audio from prompts or other inputs. Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes. The model uses a diffusion-transformer-based architecture designed to generate high-fidelity visual frames while...

Downloads: 119 This Week

Last Update: 2026-05-28
See Project
3

HunyuanImage-3.0

A Powerful Native Multimodal Model for Image Generation

HunyuanImage-3.0 is a powerful, native multimodal text-to-image generation model released by Tencent’s Hunyuan team. It unifies multimodal understanding and generation in a single autoregressive framework, combining text and image modalities seamlessly rather than relying on separate image-only diffusion components. It uses a Mixture-of-Experts (MoE) architecture with many expert subnetworks to scale efficiently, deploying only a subset of experts per token, which allows large parameter counts without linear inference cost explosion. ...

1 Review

Downloads: 4 This Week

Last Update: 2026-02-03
See Project
4

GLM-4.6V

GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

GLM-4.6V represents the latest generation of the GLM-V family and marks a major step forward in multimodal AI by combining advanced vision-language understanding with native “tool-call” capabilities, long-context reasoning, and strong generalization across domains. Unlike many vision-language models that treat images and text separately or require intermediate conversions, GLM-4.6V allows inputs such as images, screenshots or document pages directly as part of its reasoning pipeline — and can output or act via tools seamlessly, bridging perception and execution. ...

Downloads: 1 This Week

Last Update: 2026-05-16
See Project
Enterprise-grade ITSM, for every business
Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity.

Freshservice is an intuitive, AI-powered platform that helps IT, operations, and business teams deliver exceptional service without the usual complexity. Automate repetitive tasks, resolve issues faster, and provide seamless support across the organization. From managing incidents and assets to driving smarter decisions, Freshservice makes it easy to stay efficient and scale with confidence.

Try it Free
5

Qwen3.5

Qwen3.5 is the large language model series developed by Qwen team

...The project represents a significant step toward “agentic AI,” meaning models that can reason through multi-step tasks and interact with external tools or environments rather than only generating text. Qwen3.5 builds on earlier Qwen generations by improving multilingual understanding, reasoning ability, and efficiency, while also introducing native multimodal capabilities that allow the model to work with both language and visual inputs. Architecturally, the system leverages modern large-scale training techniques and mixture-of-experts style efficiency so that very large parameter counts can be used while keeping inference practical.

Downloads: 19 This Week

Last Update: 2026-06-03
See Project
6

MiniMind-O

A 0.1B Omni model trained from scratch

...It includes both mini and full training data paths, allowing learners to run a complete workflow quickly or reproduce the released model setup more closely. The implementation emphasizes native PyTorch code instead of relying on high-level third-party abstractions. minimind-o is most useful for developers and researchers who want to understand how multimodal and speech-capable AI systems are built from the ground up.

Downloads: 0 This Week

Last Update: 2026-06-08
See Project
7

FLUX.2

Official inference repo for FLUX.2 models

FLUX.2 is a state-of-the-art open-weight image generation and editing model released by Black Forest Labs aimed at bridging the gap between research-grade capabilities and production-ready workflows. The model offers both text-to-image generation and powerful image editing, including editing of multiple reference images, with fidelity, consistency, and realism that push the limits of what open-source generative models have achieved. It supports high-resolution output (up to ~4 megapixels),...

Downloads: 31 This Week

Last Update: 2026-03-12
See Project
8

IQuest-Coder-V1 Model Family

New family of code large language models (LLMs)

...These models range from tens of billions to smaller footprints and are trained on a novel code-flow multi-stage paradigm that captures how real software evolves over time — not just static code snapshots — giving them a deeper semantic understanding of programming logic. They support native long contexts up to 128K tokens, enabling them to reason across large codebases and multi-file interactions without context fragmentation, and include “Thinking” variants optimized for complex reasoning and “Loop” variants with recurrent mechanisms to improve inference efficiency. IQuest-Coder-V1 delivers state-of-the-art performance on multiple coding benchmarks, demonstrating strong results in competitive programming, tool use, and agentic code generation.

Downloads: 0 This Week

Last Update: 2026-03-02
See Project
9

MiniMax-M1

Open-weight, large-scale hybrid-attention reasoning model

...It is built on the MiniMax-Text-01 foundation and keeps the same massive parameter budget, but reworks the attention and training setup for better reasoning and test-time compute scaling. Architecturally, it combines Mixture-of-Experts layers with lightning attention, enabling the model to support a native context length of 1 million tokens while using far fewer FLOPs than comparable reasoning models for very long generations. The team emphasizes efficient scaling of test-time compute: at 100K-token generation lengths, M1 reportedly uses only about 25 percent of the FLOPs of some competing models, making extended “think step” traces more feasible. ...

Downloads: 0 This Week

Last Update: 2025-12-01
See Project
Ship Agents Faster
Transform your applications and workflows into powerful agentic systems at global scale.

Gemini Enterprise Agent Platform lets you rapidly build, scale, govern and optimize production-ready agents grounded in your organization's data. The platform enables developers to build custom or pre-built agents for virtually any use case. New customers get $300 in free credits.

Get Started Free
10

gemini-web2api

Convert Google Gemini web into OpenAI-compatible API

gemini-web2api is a Python bridge that exposes Google Gemini web access through OpenAI-compatible API endpoints. It is designed to let OpenAI-style clients connect to Gemini-like models through routes such as chat completions, models, responses, and native Gemini-compatible endpoints. The project can run as a simple local server and uses a mostly single-file design with an optional dependency for streaming. It supports model aliases for Flash, Thinking, Pro-style routing, Auto, and Lite variants. The tool also includes optional API keys, function calling, SSE streaming, web search access, Docker deployment, and client examples for OpenAI SDK-style usage. ...

Downloads: 0 This Week

Last Update: 1 day ago
See Project
11

FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

FlashMLA is a high-performance decoding kernel library designed especially for Multi-Head Latent Attention (MLA) workloads, targeting NVIDIA Hopper GPU architectures. It provides optimized kernels for MLA decoding, including support for variable-length sequences, helping reduce latency and increase throughput in model inference systems using that attention style. The library supports both BF16 and FP16 data types, and includes a paged KV cache implementation with a block size of 64 to...

Downloads: 0 This Week

Last Update: 2026-04-29
See Project
12

GLM-4-32B-0414

Open Multilingual Multimodal Chat LMs

GLM-4-32B-0414 is a powerful open-source large language model featuring 32 billion parameters, designed to deliver performance comparable to leading models like OpenAI’s GPT series. It supports multilingual and multimodal chat capabilities with an extensive 32K token context length, making it ideal for dialogue, reasoning, and complex task completion. The model is pre-trained on 15 trillion tokens of high-quality data, including substantial synthetic reasoning datasets, and further enhanced...

Downloads: 0 This Week

Last Update: 2025-06-27
See Project
13

Kimi K2.6

Multimodal agent model for coding, orchestration, and autonomy

Kimi K2.6 is an open-source native multimodal agentic model built for advanced autonomous execution, long-horizon coding, and large-scale task orchestration. It is designed to handle complex end-to-end software workflows across multiple languages and domains, including front-end development, DevOps, performance optimization, and coding-driven design. Beyond coding, it can transform prompts and visual inputs into production-ready interfaces and lightweight full-stack outputs with structured layouts, interactivity, and polished visual detail. ...

Downloads: 0 This Week

Last Update: 2026-04-20
See Project
14

gpt-oss-20b

OpenAI’s compact 20B open model for fast, agentic, and local use

GPT-OSS-20B is OpenAI’s smaller, open-weight language model optimized for low-latency, agentic tasks, and local deployment. With 21B total parameters and 3.6B active parameters (MoE), it fits within 16GB of memory thanks to native MXFP4 quantization. Designed for high-performance reasoning, it supports Harmony response format, function calling, web browsing, and code execution. Like its larger sibling (gpt-oss-120b), it offers adjustable reasoning depth and full chain-of-thought visibility for better interpretability. It’s released under a permissive Apache 2.0 license, allowing unrestricted commercial and research use. ...

Downloads: 0 This Week

Last Update: 2025-08-05
See Project
15

gpt-oss-120b

OpenAI’s open-weight 120B model optimized for reasoning and tooling

GPT-OSS-120B is a powerful open-weight language model by OpenAI, optimized for high-level reasoning, tool use, and agentic tasks. With 117B total parameters and 5.1B active parameters, it’s designed to fit on a single H100 GPU using native MXFP4 quantization. The model supports fine-tuning, chain-of-thought reasoning, and structured outputs, making it ideal for complex workflows. It operates in OpenAI’s Harmony response format and can be deployed via Transformers, vLLM, Ollama, LM Studio, and PyTorch. Developers can control the reasoning level (low, medium, high) to balance speed and depth depending on the task. ...

Downloads: 0 This Week

Last Update: 2025-08-05
See Project
16

Qwen3.6-27B

Dense multimodal Qwen model for coding, agents, and long context

Qwen3.6-27B is an open-weight multimodal model built to deliver strong real-world coding, agent, and long-context performance in a dense 27B-parameter architecture. It combines a causal language model with a vision encoder and supports text, image, and video inputs, making it suitable for both software workflows and broader multimodal tasks. The model emphasizes stability and practical developer utility, with major improvements in agentic coding, frontend generation, and repository-level...

Downloads: 0 This Week

Last Update: 2026-04-22
See Project
17

Qwen3.6-35B-A3B

Open multimodal model for coding, agents, and long-context tasks

...A notable addition is thinking preservation, which allows the model to retain reasoning context from earlier messages, improving iterative work and reducing redundant computation. Architecturally, it uses a Mixture-of-Experts design with 35B total parameters and 3B active, supports a native 262K-token context window, and can be extended to about 1M tokens with YaRN. It also performs strongly across coding, agent, vision, reasoning, and document-understanding benchmarks.

Downloads: 0 This Week

Last Update: 2026-04-20
See Project
18

Gemopus

Stable fine-tuned Gemma model for structured, clear responses

Gemopus is a supervised fine-tuned version of the Gemma 4 26B instruction model, designed with a “stability first” philosophy that prioritizes reliable reasoning structure over aggressive chain-of-thought imitation. Instead of relying on distilled reasoning traces from external models, it focuses on preserving Gemma’s native reasoning style while improving answer clarity, structure, and consistency. The model enhances response organization through better use of formatting, improves readability, and delivers more natural conversational outputs by removing rigid or overly mechanical tones. It also strengthens technical explanations, balancing rigor with accessibility. ...

Downloads: 0 This Week

Last Update: 2026-04-14
See Project
19

DiffusionGemma

NVFP4 DiffusionGemma model for fast multimodal text generation

...Its diffusion-based generation produces tokens in parallel 256-token blocks, enabling very high-speed output, with reported generation above 1,100 tokens per second on NVIDIA Hopper H100 in FP8. The model supports a 256K-token context window, configurable thinking mode, native function calling, structured JSON output, and multilingual inference across 35+ languages. The NVFP4 quantization reduces weights and activations from 16-bit to 4-bit, lowering disk size and GPU memory needs for vLLM deployment.

Downloads: 0 This Week

Last Update: 5 days ago
See Project
20

Laguna XS.2

Open agentic coding model optimized for local deployment

...It uses a hybrid attention architecture that combines Sliding Window Attention and global attention layers, reducing memory requirements and improving inference speed. Laguna XS.2 supports native reasoning with interleaved thinking between tool calls, enabling more capable autonomous coding agents and multi-step workflows. The model features a 262K-token context window, preserved reasoning across interactions, FP8 KV-cache optimization, and compatibility with local deployment ecosystems such as Ollama and vLLM.

Downloads: 0 This Week

Last Update: 6 days ago
See Project
21

Gemma 4 12B

Unified multimodal Gemma model for local coding and reasoning

...It supports text, image, audio, and video inputs with text output, making it useful for transcription, image understanding, video analysis, coding, and agentic workflows. The model has 11.95B parameters, 48 layers, a 256K-token context window, and support for over 140 languages. It also includes configurable thinking modes, native system prompt support, function calling, and strong benchmark performance for its size. It is optimized for consumer GPUs, workstations, and streamlined local deployment.

Downloads: 0 This Week

Last Update: 2026-06-03
See Project
22

Gemma 4

Google’s flagship dense multimodal model for coding and reasoning

...Built as the most capable model in the Gemma 4 family, it combines strong reasoning performance with a large 256K-token context window and configurable thinking modes. Gemma 4 31B supports native function calling, structured outputs, and more than 140 languages, making it suitable for enterprise assistants, coding agents, document analysis, and multilingual applications. Google positions it as a frontier-level model that can run on consumer GPUs and workstations while achieving leading results across reasoning, mathematics, coding, and multimodal benchmarks.

Downloads: 0 This Week

Last Update: 2026-06-03
See Project
23

MiMo-V2.5

Omnimodal AI model for agents, coding, and long-context tasks

MiMo-V2.5 is a native omnimodal large language model developed by Xiaomi, designed for advanced agentic workflows, multimodal reasoning, and long-context processing. Built on a Mixture-of-Experts architecture with approximately 309B total parameters and around 15B activated per inference, it balances high capability with efficient execution. The model natively processes text, images, video, and audio within a unified system, enabling cross-modal understanding and complex task execution in a single pipeline. ...

Downloads: 0 This Week

Last Update: 2026-05-04
See Project
24

Qwen3.6-35B-A3B-FP8

FP8 Qwen model for efficient multimodal coding and agent tasks

...A key capability is thinking preservation, which allows the model to retain reasoning traces from earlier messages, helping reduce repeated computation and improving consistency in iterative tasks. The model uses a Mixture-of-Experts design with 35B total parameters and 3B active, supports a native context window of 262,144 tokens, and can be extended to about 1,010,000 tokens with YaRN. It is compatible with major inference frameworks such as Transformers, vLLM, SGLang, and KTransformers, making it a practical high-performance option.

Downloads: 0 This Week

Last Update: 2026-04-20
See Project
25

Ministral 3 8B Instruct 2512

Compact 8B multimodal instruct model optimized for edge deployment

...Its multilingual support covers dozens of major languages, allowing it to work across diverse global environments and applications. The model adheres reliably to system prompts, supports native function calling, and outputs clean JSON, giving it strong tool-use behavior.

Downloads: 0 This Week

Last Update: 2025-12-03
See Project