cpu memory usage free download

Showing 13 open source projects for "cpu memory usage"

View related business solutions

AI Models Linux Clear Filters & Widen Search

Our Free Plans just got better! | Auth0
With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.

Try free now
Fully Managed MySQL, PostgreSQL, and SQL Server
Automatic backups, patching, replication, and failover. Focus on your app, not your database.

Cloud SQL handles your database ops end to end, so you can focus on your app.

Try Free
1

FastSD CPU

Fast stable diffusion on CPU and AI PC

...The repository contains multiple interfaces including a desktop GUI for simple generation, an advanced web-based UI with support for extensions like LoRA and ControlNet, and a command-line interface for scripted usage or server deployments. With support for performance-oriented libraries such as OpenVINO and hardware acceleration on platforms like Intel AI PCs, FastSD CPU aims to shrink generation times dramatically compared with naive CPU implementations.

Downloads: 31 This Week

Last Update: 2026-05-02
See Project
2

Kitten TTS

State-of-the-art TTS model under 25MB

KittenTTS is an open-source, ultra-lightweight, and high-quality text-to-speech model featuring just 15 million parameters and a binary size under 25 MB. It is designed for real-time CPU-based deployment across diverse platforms. Ultra-lightweight, model size less than 25MB. CPU-optimized, runs without GPU on any device. High-quality voices, several premium voice options available. Fast inference, optimized for real-time speech synthesis.

Downloads: 5 This Week

Last Update: 2026-02-24
See Project
3

Tencent-Hunyuan-Large

Open-source large language model family from Tencent Hunyuan

...It is designed with long-context capabilities, quantization support, and high performance on benchmarks across general reasoning, mathematics, language understanding, and Chinese / multilingual tasks. It aims to provide competitive capability with efficient deployment and inference. FP8 quantization support to reduce memory usage (~50%) while maintaining precision. High benchmarking performance on tasks like MMLU, MATH, CMMLU, C-Eval, etc.

Downloads: 2 This Week

Last Update: 2025-09-24
See Project
4

ChatGLM.cpp

C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)

ChatGLM.cpp is a C++ implementation of the ChatGLM-6B model, enabling efficient local inference without requiring a Python environment. It is optimized for running on consumer hardware.

Downloads: 10 This Week

Last Update: 2025-01-21
See Project
Full-stack observability with actually useful AI | Grafana Cloud
Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.

Create free account
5

Mistral Finetune

Memory-efficient and performant finetuning of Mistral's models

mistral-finetune is an official lightweight codebase designed for memory-efficient and performant finetuning of Mistral’s open models (e.g. 7B, instruct variants). It builds on techniques like LoRA (Low-Rank Adaptation) to allow customizing models without full parameter updates, which reduces GPU memory footprint and training cost. The repo includes utilities for data preprocessing (e.g. reformat_data.py), validation scripts, and example YAML configs for training variants like 7B base or...

Downloads: 0 This Week

Last Update: 2025-10-04
See Project
6

CogVideo

Text and image to video generation: CogVideoX and CogVideo

CogVideo is an open-source family of advanced video generation models that can create videos from text, images, or existing video inputs. Built on large-scale Transformer and diffusion architectures, it enables multimodal generation across text-to-video, image-to-video, and video continuation tasks. The latest CogVideoX models offer higher resolution outputs, longer video durations, and improved controllability through prompt engineering. The project includes tools for inference,...

Downloads: 25 This Week

Last Update: 2025-10-04
See Project
7

Qwen

The official repo of Qwen chat & pretrained large language model

Qwen is a series of large language models developed by Alibaba Cloud, consisting of various pretrained versions like Qwen-1.8B, Qwen-7B, Qwen-14B, and Qwen-72B. These models, which range from smaller to larger configurations, are designed for a wide range of natural language processing tasks. They are openly available for research and commercial use, with Qwen's code and model weights shared on GitHub. Qwen's capabilities include text generation, comprehension, and conversation, making it a...

1 Review

Downloads: 11 This Week

Last Update: 2026-03-05
See Project
8

Z80-μLM

Z80-μLM is a 2-bit quantized language model

...The project sits at the intersection of machine learning and systems constraints, showing how model architecture, quantization, and inference code generation can be adapted to extreme memory and compute limits. It also functions as an educational reference for how to reduce inference to operations that fit an old-school instruction set and runtime environment.

1 Review

Downloads: 0 This Week

Last Update: 2026-01-27
See Project
9

Qwen2.5-Omni

Capable of understanding text, audio, vision, video

Qwen2.5-Omni is an end-to-end multimodal flagship model in the Qwen series by Alibaba Cloud, designed to process multiple modalities (text, images, audio, video) and generate responses both as text and natural speech in streaming real-time. It supports “Thinker-Talker” architecture, and introduces innovations for aligning modalities over time (for example synchronizing video/audio), robust speech generation, and low-VRAM/quantized versions to make usage more accessible. It holds...

Downloads: 0 This Week

Last Update: 2025-09-23
See Project
MongoDB Atlas runs apps anywhere
Deploy in 115+ regions with the modern database for every enterprise.

MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.

Start Free
10

Nemotron 3

Large language model developed and released by NVIDIA

...It is the post-trained and FP8-quantized variant of the Nemotron 3 Nano model, meaning its weights and activations are represented in 8-bit floating point (FP8) to dramatically reduce memory usage and computational cost while retaining high accuracy. The base Nano architecture uses a hybrid Mamba-Transformer Mixture-of-Experts (MoE) design, allowing the model to activate only a small fraction of its 31.6 billion parameters per token, which improves speed and efficiency without sacrificing quality on complex queries. ...

Downloads: 0 This Week

Last Update: 2026-01-07
See Project
11

Nemotron 3 Super

Open language model developed by NVIDIA as part of Nemotron-3 family

...The model contains approximately 120 billion parameters, but employs a Mixture-of-Experts architecture that activates only a smaller subset of parameters during inference, improving computational efficiency while maintaining high capability. Its architecture combines Transformer attention layers with Mamba state-space components to balance long-context reasoning, memory efficiency, and high-quality language generation. The model is optimized for building AI agents that must perform complex tasks such as planning, tool usage, coding assistance, and multi-step reasoning.

Downloads: 0 This Week

Last Update: 2026-03-13
See Project
12

MiMo-V2.5-Pro

Flagship MoE model for long-context agents and complex coding

...The model supports a 1 million token context window, enabling it to maintain coherence across long workflows involving thousands of tool calls and multi-step reasoning chains. Architecturally, it uses a hybrid attention system combining Sliding Window Attention and Global Attention to significantly reduce memory usage while preserving long-context performance. It also integrates multi-token prediction modules that accelerate inference and improve reinforcement learning efficiency. Trained on around 27 trillion tokens with FP8 mixed precision and refined through supervised fine-tuning, large-scale agentic reinforcement learning, and distillation.

Downloads: 0 This Week

Last Update: 2026-05-04
See Project
13

Mistral Large 3 675B Instruct 2512 NVFP4

Quantized 675B multimodal instruct model optimized for NVFP4

Mistral Large 3 675B Instruct 2512 NVFP4 is a frontier-scale multimodal Mixture-of-Experts model featuring 675B total parameters and 41B active parameters, trained from scratch on 3,000 H200 GPUs. This NVFP4 checkpoint is a post-training-activation quantized version of the original instruct model, created through a collaboration between Mistral AI, vLLM, and Red Hat using llm-compressor. It retains the same instruction-tuned behavior as the FP8 model, making it ideal for production...

Downloads: 0 This Week

Last Update: 2025-12-03
See Project