size free download - SourceForge

Showing 31 open source projects for "size"

1

Kitten TTS

State-of-the-art TTS model under 25MB

KittenTTS is an open-source, ultra-lightweight, high-quality text-to-speech model with just 15 million parameters and a binary size under 25 MB. It is CPU-optimized and designed for real-time deployment without a GPU across diverse platforms. Several premium voice options are available, and inference is optimized for real-time speech synthesis.
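As a rough check on the size claim, the sketch below estimates weight storage for 15 million parameters at a few common precisions. The listing does not say which precision KittenTTS ships in, so the precisions here are assumptions for illustration, and file-format overhead is ignored.

```python
# Rough weight storage for a 15M-parameter model at different precisions.
# Assumption: only weights are counted; container overhead and any
# voice or tokenizer data are ignored.
PARAMS = 15_000_000  # parameter count stated in the listing


def size_mb(params: int, bits_per_weight: int) -> float:
    """Return weight storage in megabytes (1 MB = 10**6 bytes)."""
    return params * bits_per_weight / 8 / 1e6


for name, bits in [("fp32", 32), ("fp16", 16), ("int8", 8)]:
    print(f"{name}: {size_mb(PARAMS, bits):.0f} MB")
# fp32: 60 MB
# fp16: 30 MB
# int8: 15 MB
```

Under these assumptions, a file under 25 MB implies fewer than 16 bits per weight on average.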

Downloads: 1 This Week

Last Update: 2026-02-24
See Project
2

FramePack

Let's make video diffusion practical

FramePack explores compact representations for sequences of image frames, targeting tasks where many near-duplicate frames carry redundant information. The idea is to “pack” frames by detecting shared structure and storing differences efficiently, which can accelerate training or inference on video-like data. By reducing I/O and memory bandwidth, datasets become lighter to load while models still see the essential temporal variation. The repository demonstrates both packing and unpacking...
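The packing idea described above can be illustrated with a toy delta encoder: keep the first frame whole and store each later frame as the positions that changed. This is a minimal sketch of the general technique, not the repository's code; the `pack` and `unpack` names and the flat integer frame format are hypothetical.

```python
Frame = list[int]  # toy frame: a flat list of pixel values (assumption)


def pack(frames: list[Frame]) -> tuple[Frame, list[dict[int, int]]]:
    """Keep the first frame whole; store later frames as {index: new_value} diffs."""
    if not frames:
        raise ValueError("need at least one frame")
    deltas = []
    prev = frames[0]
    for frame in frames[1:]:
        # Record only the positions whose value differs from the previous frame.
        deltas.append({i: v for i, (p, v) in enumerate(zip(prev, frame)) if p != v})
        prev = frame
    return list(frames[0]), deltas


def unpack(first: Frame, deltas: list[dict[int, int]]) -> list[Frame]:
    """Rebuild every frame by replaying the diffs in order."""
    frames = [list(first)]
    for delta in deltas:
        frame = list(frames[-1])
        for i, v in delta.items():
            frame[i] = v
        frames.append(frame)
    return frames


frames = [[1, 2, 3], [1, 2, 4], [1, 2, 4]]
first, deltas = pack(frames)
print(deltas)  # [{2: 4}, {}]
```

Near-duplicate frames produce small or empty diffs, which is where the I/O savings would come from. The sketch assumes all frames have the same length.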

Downloads: 44 This Week

Last Update: 2025-10-21
See Project
3

Z-Image

Image generation model with single-stream diffusion transformer

...The project includes several variants: Z-Image-Turbo, a distilled version optimized for speed and low resource consumption; Z-Image-Base, the full-capacity foundation model; and Z-Image-Edit, fine-tuned for image editing tasks. Despite its compact size, Z-Image produces outputs that closely rival those from much larger models — including strong rendering of bilingual (English and Chinese) text inside images, accurate prompt adherence, and good layout and composition.

Downloads: 30 This Week

Last Update: 2026-02-09
See Project
4

FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

...It provides optimized kernels for MLA decoding, including support for variable-length sequences, helping reduce latency and increase throughput in inference systems that use this attention style. The library supports both BF16 and FP16 data types and includes a paged KV cache implementation with a block size of 64 to manage memory efficiently during decoding. In compute-bound settings it can reach up to ~660 TFLOPS on H800 SXM5 hardware, while in memory-bound configurations it can reach ~3000 GB/s of memory throughput. The team regularly updates it with performance improvements; for example, a 2025 update claims 5% to 15% gains on compute-bound workloads while maintaining API compatibility.
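To show what a paged KV cache with a block size of 64 means in practice, here is a minimal bookkeeping sketch in plain Python. It is not FlashMLA's API: the `PagedKVCache` class and its methods are hypothetical, and only the block size comes from the listing.

```python
BLOCK_SIZE = 64  # tokens per cache block, as stated in the listing


class PagedKVCache:
    """Toy allocator mapping each sequence to a list of fixed-size physical blocks."""

    def __init__(self, num_blocks: int) -> None:
        self.free_blocks = list(range(num_blocks))
        self.block_tables: dict[int, list[int]] = {}
        self.seq_lens: dict[int, int] = {}

    def append_token(self, seq_id: int) -> tuple[int, int]:
        """Reserve a slot for one new token; return (physical_block, offset)."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.seq_lens.get(seq_id, 0)
        if length % BLOCK_SIZE == 0:  # no block yet, or the last block is full
            if not self.free_blocks:
                raise MemoryError("no free KV cache blocks")
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = length + 1
        return table[length // BLOCK_SIZE], length % BLOCK_SIZE

    def free(self, seq_id: int) -> None:
        """Return a finished sequence's blocks to the pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)


cache = PagedKVCache(num_blocks=4)
for _ in range(65):
    slot = cache.append_token(seq_id=0)
print(len(cache.block_tables[0]), slot[1])  # 2 0
```

Because blocks are allocated only when the previous one fills, sequences of different lengths share one pool without reserving space for a maximum length up front.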

Downloads: 0 This Week

Last Update: 2026-04-29
See Project
5

GLM-4.1V

GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

GLM-4.1V, often described as a smaller, lighter member of the GLM-V family, offers a more resource-efficient option for users who want multimodal capabilities without large compute resources. Though smaller in scale, it maintains competitive performance for its size: on a number of multimodal reasoning and vision-language tasks it outperforms some much larger models from other families. It represents a trade-off: somewhat reduced capacity compared to 4.5V or 4.6V, in exchange for speed, deployability, and lower hardware requirements, which makes it especially useful for developers experimenting locally, building lightweight agents, or deploying on limited infrastructure. ...

Downloads: 0 This Week

Last Update: 2026-05-16
See Project
6

GLM-OCR

Accurate × Fast × Comprehensive

...Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B), enabling deployment in high-concurrency services and edge environments. The model’s multimodal capabilities allow it to reason across image and text content holistically, capturing structured and unstructured information from pages that include dense tables, seals, code snippets, and varied document graphics. GLM-OCR integrates a comprehensive SDK and inference toolchain that makes it easy for developers to install, invoke, and embed into production pipelines with simple commands or APIs.

Downloads: 1 This Week

Last Update: 2026-04-08
See Project
7

MiniMind-O

A 0.1B Omni model trained from scratch

...It extends the MiniMind family by exploring a model that can handle text, audio, and image inputs while producing text and streaming speech outputs. The project is designed to make multimodal AI training more accessible by keeping the model size small enough for ordinary personal hardware. It includes both mini and full training data paths, allowing learners to run a complete workflow quickly or reproduce the released model setup more closely. The implementation emphasizes native PyTorch code instead of relying on high-level third-party abstractions. MiniMind-O is most useful for developers and researchers who want to understand how multimodal and speech-capable AI systems are built from the ground up.

Downloads: 0 This Week

Last Update: 2026-06-08
See Project
8

Step3-VL-10B

Multimodal model achieving SOTA performance

Step3-VL-10B is an open-source multimodal foundation model developed by StepFun AI that pushes the boundaries of what compact models can achieve by combining visual and language understanding in a single architecture. Despite having only about 10 billion parameters, it delivers performance that rivals or even surpasses much larger models (10×–20× larger) on a wide range of multimodal benchmarks covering reasoning, perception, and complex tasks, positioning it as one of the most powerful...

Downloads: 0 This Week

Last Update: 2026-01-22
See Project
9

GLM-4.5V

GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

GLM-4.5V is the preceding iteration in the GLM-V series that laid much of the groundwork for general multimodal reasoning and vision-language understanding. It embodies the design philosophy of mixing visual and textual modalities into a unified model capable of general-purpose reasoning, content understanding, and generation, while already supporting a wide variety of tasks: from image captioning and visual question answering to content recognition, GUI-based agents, video understanding,...

Downloads: 1 This Week

Last Update: 2026-05-16
See Project
10

Qwen2-Audio

Repo of Qwen2-Audio chat & pretrained large audio language model

Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio inputs (speech, sounds, and so on) and to perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive, voice-only input) and Audio Analysis (audio plus text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound...

Downloads: 0 This Week

Last Update: 2025-09-23
See Project
11

Tongyi DeepResearch

Tongyi Deep Research, the Leading Open-source Deep Research Agent

...It’s built to act like a research agent: synthesizing, reasoning, retrieving information via the web and documents, and backing its outputs with evidence. The model has about 30.5 billion parameters, of which only ~3.3B are active for any given token. It uses a mix of synthetic data generation, fine-tuning, and reinforcement learning; supports benchmarks covering web search, document understanding, question answering, and “agentic” tasks; and provides inference tools, evaluation scripts, and “web agent” style interfaces. The aim is to enable more autonomous, agentic models that can perform sustained knowledge gathering, reasoning, and synthesis across multiple sources (web, files, etc.).
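The gap between total and active parameters is easier to see as a ratio. The short calculation below uses only the two figures from the listing; the helper name is hypothetical.

```python
TOTAL_PARAMS_B = 30.5   # total parameters in billions, from the listing
ACTIVE_PARAMS_B = 3.3   # approximate active parameters per token, from the listing


def active_fraction(active_b: float, total_b: float) -> float:
    """Fraction of the model's parameters used to process one token."""
    return active_b / total_b


print(f"{active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B):.1%} of weights active per token")
# 10.8% of weights active per token
```

Per-token compute scales with the active count, while storage still has to hold all 30.5B parameters.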

Downloads: 0 This Week

Last Update: 2026-02-27
See Project
12

Grok-1

Open-source, high-performance Mixture-of-Experts large language model

...In March 2024, xAI released Grok-1's model weights and architecture under the Apache 2.0 license, making them openly accessible to developers. The accompanying GitHub repository provides JAX example code for loading and running the model. Due to its substantial size, utilizing Grok-1 requires a machine with significant GPU memory. The repository's MoE layer implementation prioritizes correctness over efficiency, avoiding the need for custom kernels. This is a full repo snapshot ZIP file of the Grok-1 code.

1 Review

Downloads: 15 This Week

Last Update: 2025-02-27
See Project
13

RoBERTa for Chinese

RoBERTa Chinese pre-training model: RoBERTa for Chinese

...It provides TensorFlow and PyTorch-compatible model releases trained on large-scale Chinese text. The project follows the main RoBERTa training ideas, including removing next sentence prediction, using more diverse data, training longer, increasing batch size, and tuning optimization settings. Its training data includes news, community discussion, encyclopedia content, and other broad Chinese text sources. The repository also describes whole word masking for Chinese and provides examples for loading and fine-tuning models on sentence-pair matching tasks. Overall, it is a useful pretrained model resource for developers who want stronger Chinese BERT-style representations for classification, matching, reading comprehension, and related NLP tasks.
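Whole word masking for Chinese can be sketched as follows: the text is first segmented into words, and every character of a selected word is masked together rather than independently. This is an illustrative sketch, not the repository's implementation; the function name is hypothetical and the word segmentation is assumed to come from an external segmenter.

```python
import random

MASK = "[MASK]"


def whole_word_mask(words: list[str], mask_prob: float, rng: random.Random) -> list[str]:
    """Return character tokens, masking all characters of each selected word."""
    tokens: list[str] = []
    for word in words:
        masked = rng.random() < mask_prob  # one decision per word, not per character
        for ch in word:
            tokens.append(MASK if masked else ch)
    return tokens


# Assumption: the sentence has already been segmented into words.
words = ["使用", "语言", "模型"]
print(whole_word_mask(words, mask_prob=0.15, rng=random.Random(0)))
```

With character-level masking, a model could see half of a word and guess the rest; masking the whole word removes that shortcut.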

Downloads: 4 This Week

Last Update: 4 days ago
See Project
14

TimeSformer

The official PyTorch implementation of our paper

TimeSformer is a vision transformer architecture for video that extends the standard attention mechanism into spatiotemporal attention. The model alternates attention along spatial and temporal dimensions (or designs variants like divided attention) so that it can capture both appearance and motion cues in video. Because the attention is global across frames, TimeSformer can reason about dependencies across long time spans, not just local neighborhoods. The official implementation in PyTorch...
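To see why dividing attention into temporal and spatial steps matters, the sketch below counts query-key pairs for joint versus divided attention. The clip shape (8 frames, 196 patches per frame) is an assumed example, the classification token is ignored, and the function names are hypothetical.

```python
def joint_pairs(frames: int, patches: int) -> int:
    """Query-key pairs when every patch attends to every patch in every frame."""
    tokens = frames * patches
    return tokens * tokens


def divided_pairs(frames: int, patches: int) -> int:
    """Pairs when each patch attends over time (same position), then over space (same frame)."""
    tokens = frames * patches
    return tokens * (frames + patches)


FRAMES, PATCHES = 8, 196  # assumed example: 8 frames, 14 x 14 patches each
print(joint_pairs(FRAMES, PATCHES))    # 2458624
print(divided_pairs(FRAMES, PATCHES))  # 319872
```

For this shape the divided scheme compares roughly 7.7 times fewer pairs per layer, and the gap widens as clips get longer.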

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
15

Llama-3.2-1B-Instruct

Instruction-tuned 1.2B LLM for multilingual text generation by Meta

...The model supports eight primary languages (including English, Spanish, Hindi, and Thai) and was trained on a curated mix of publicly available online data, with a December 2023 knowledge cutoff. Llama-3.2-1B is lightweight enough for deployment on constrained devices like smartphones, using quantization techniques like SpinQuant and QLoRA to reduce model size and latency. Despite its small size, it performs competitively across benchmarks such as MMLU, ARC, and TLDR summarization. The model is distributed under the Llama 3.2 Community License, requiring attribution and adherence to Meta’s Acceptable Use Policy.

Downloads: 0 This Week

Last Update: 2025-07-02
See Project
16

GLM-4.5-Air

Compact hybrid reasoning language model for intelligent responses

...Open-sourced under the MIT license, it is commercially usable and integrates with transformers, vLLM, and SGLang inference frameworks. It includes FP8 variants for faster inference and reduced memory requirements. Despite its smaller size compared to full GLM-4.5, GLM-4.5-Air maintains high performance.

Downloads: 0 This Week

Last Update: 2025-07-31
See Project
17

t5-small

T5-Small: Lightweight text-to-text transformer for NLP tasks

...With only 60 million parameters, T5-Small is compact and suitable for fast inference or deployment in constrained environments. It was pretrained on the C4 dataset using both unsupervised denoising and supervised learning on tasks like sentiment analysis, NLI, and QA. Despite its size, it performs competitively across 24 NLP benchmarks, making it a strong candidate for prototyping and fine-tuning. T5-Small is compatible with major deep learning frameworks including PyTorch, TensorFlow, JAX, and ONNX. The model is open-source under the Apache 2.0 license and has wide support across Hugging Face's ecosystem.

Downloads: 0 This Week

Last Update: 2025-07-02
See Project
18

Llama-3.2-1B

Llama 3.2–1B: Multilingual, instruction-tuned model for mobile AI

...The model supports eight officially listed languages (including Spanish, German, Hindi, and Thai) but can be adapted to more. Llama 3.2-1B outperforms other open models in several benchmarks relative to its size and offers quantized versions for efficiency. It uses a refined transformer architecture with Grouped-Query Attention (GQA) and supports long context windows of up to 128k tokens.
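Grouped-Query Attention matters most at long context, because the KV cache grows with the number of key/value heads. The estimate below uses assumed architecture values (16 layers, 32 query heads, 8 KV heads, head dimension 64, 16-bit cache entries) that are not stated in the listing; check the model's config before relying on them.

```python
def kv_cache_bytes(tokens: int, layers: int, kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens  # 2 = keys + values


# Assumed configuration, for illustration only.
LAYERS, QUERY_HEADS, KV_HEADS, HEAD_DIM = 16, 32, 8, 64
CONTEXT = 128 * 1024  # 128k tokens

gqa = kv_cache_bytes(CONTEXT, LAYERS, KV_HEADS, HEAD_DIM)
mha = kv_cache_bytes(CONTEXT, LAYERS, QUERY_HEADS, HEAD_DIM)  # one KV head per query head
print(gqa / 2**30, mha / 2**30)  # 4.0 16.0
```

Under these assumptions, sharing each KV head across four query heads cuts the full-context cache from 16 GiB to 4 GiB.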

Downloads: 0 This Week

Last Update: 2025-07-02
See Project
19

bart-large-cnn

Summarization model fine-tuned on CNN/DailyMail articles

...Its architecture allows it to model both language understanding and generation tasks effectively. The model supports usage in PyTorch, TensorFlow, and JAX, and is integrated with the Hugging Face pipeline API for simple deployment. Due to its size and performance, it's widely used in real-world summarization applications such as news aggregation, legal document condensing, and content creation.

Downloads: 0 This Week

Last Update: 2025-07-02
See Project
20

bge-base-en-v1.5

Efficient English embedding model for semantic search and retrieval

bge-base-en-v1.5 is an English sentence embedding model from BAAI optimized for dense retrieval tasks, part of the BGE (BAAI General Embedding) family. It is a fine-tuned BERT-based model designed to produce high-quality, semantically meaningful embeddings for tasks like semantic similarity, information retrieval, classification, and clustering. This version (v1.5) improves retrieval performance and stabilizes similarity score distribution without requiring instruction-based prompts. With...

Downloads: 0 This Week

Last Update: 2025-07-01
See Project
21

DiffusionGemma

NVFP4 DiffusionGemma model for fast multimodal text generation

...The model supports a 256K-token context window, configurable thinking mode, native function calling, structured JSON output, and multilingual inference across 35+ languages. The NVFP4 quantization reduces weights and activations from 16-bit to 4-bit, lowering disk size and GPU memory needs for vLLM deployment.

Downloads: 0 This Week

Last Update: 2026-06-12
See Project
22

Gemma 4 12B

Unified multimodal Gemma model for local coding and reasoning

...The model has 11.95B parameters, 48 layers, a 256K-token context window, and support for over 140 languages. It also includes configurable thinking modes, native system prompt support, function calling, and strong benchmark performance for its size. It is optimized for consumer GPUs, workstations, and streamlined local deployment.

Downloads: 0 This Week

Last Update: 2026-06-03
See Project
23

GigaChat 3 Ultra

High-performance MoE model with MLA, MTP, and multilingual reasoning

GigaChat 3 Ultra is a flagship instruct-model built on a custom Mixture-of-Experts architecture with 702B total and 36B active parameters. It leverages Multi-head Latent Attention to compress the KV cache into latent vectors, dramatically reducing memory demand and improving inference speed at scale. The model also employs Multi-Token Prediction, enabling multi-step token generation in a single pass for up to 40% faster output through speculative and parallel decoding techniques. Its...

Downloads: 0 This Week

Last Update: 2025-12-03
See Project
24

Jan-v1-edge

Jan-v1-edge: efficient 1.7B reasoning model optimized for edge devices

Jan-v1-edge is a lightweight agentic language model developed by JanHQ, designed for fast and reliable on-device execution. It is the second release in the Jan Family and was distilled from the larger Jan-v1 model, retaining strong reasoning and problem-solving capabilities while reducing its computational footprint. The model was refined through a two-stage post-training process: Supervised Fine-Tuning (SFT) to transfer knowledge from Jan-v1, followed by Reinforcement Learning with...

Downloads: 0 This Week

Last Update: 2025-09-05
See Project
25

Qwen-Image-Edit

Advanced bilingual image editing with semantic control

...The model excels at semantic edits like style transfer, object rotation, and novel view synthesis, while also handling precise appearance edits such as adding or removing elements without altering surrounding regions. A standout feature is its bilingual text editing in English and Chinese, which preserves original font, size, and style during modifications. Benchmarks confirm its state-of-the-art performance in image editing, establishing it as a reliable foundation for both artistic and practical tasks. Its applications span IP creation, meme generation, background changes, clothing edits, and fine corrections in artworks or calligraphy.

Downloads: 0 This Week

Last Update: 2025-09-01
See Project