Search Results for "cpu memory usage" - Page 5

Showing 140 open source projects for "cpu memory usage"

1

FastChat

Open platform for training, serving, and evaluating language models

FastChat is an open platform for training, serving, and evaluating large language model-based chatbots. If you do not have enough memory, you can enable 8-bit compression by adding --load-8bit to the FastChat commands. This can reduce memory usage by around half with slightly degraded model quality. It is compatible with the CPU, GPU, and Metal backends. Vicuna-13B with 8-bit compression can run on a single NVIDIA 3090/4080/T4/V100 (16GB) GPU. In addition, you can add --cpu-offloading to offload weights that don't fit on your GPU into CPU memory. ...

Downloads: 0 This Week

Last Update: 2024-02-11
See Project
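
The "around half" figure quoted for FastChat's 8-bit compression follows from simple arithmetic on weight storage. Below is a rough sketch, assuming the weights dominate memory and ignoring activations, the KV cache, and runtime overhead. The function name is illustrative and is not part of FastChat.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the model weights alone, in GiB."""
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 2**30


# Assumption: Vicuna-13B has roughly 13 billion parameters.
fp16 = weight_memory_gib(13e9, 16)
int8 = weight_memory_gib(13e9, 8)
print(f"16-bit: {fp16:.1f} GiB, 8-bit: {int8:.1f} GiB")
```

Under these assumptions the 16-bit weights need about 24 GiB and the 8-bit weights about 12 GiB, which is consistent with the claim that the compressed model fits on a 16GB GPU.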
2

Quarto Solver

Quarto Solver calculates optimal moves for Quarto and Quarto 2x2

Calculates an optimal move for every game state in Quarto and Quarto 2x2.

Downloads: 4 This Week

Last Update: 2024-06-01
See Project
3

KoboldCpp

Run GGUF models easily with a UI or API. One File. Zero Install.

KoboldCpp is an easy-to-use AI text-generation software for GGML and GGUF models, inspired by the original KoboldAI. It's a single self-contained distributable that builds off llama.cpp and adds many additional powerful features.

Downloads: 313 This Week

Last Update: 9 hours ago
See Project
4

HunyuanVideo-I2V

A Customizable Image-to-Video Model based on HunyuanVideo

HunyuanVideo-I2V is a customizable image-to-video generation framework developed by Tencent, extending the capabilities of HunyuanVideo. It allows for high-quality video creation from still images, using PyTorch and providing pre-trained model weights, inference code, and customizable training options. The system includes LoRA training code for adding special effects and enhancing video realism, aiming to offer versatile and scalable solutions for generating videos from static image inputs.

1 Review

Downloads: 4 This Week

Last Update: 2025-03-10
See Project
5

AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs

AutoGPTQ is an implementation of GPTQ, a post-training weight quantization method, that optimizes large language models (LLMs) for faster inference by reducing their memory and computational footprint while maintaining accuracy.

Downloads: 0 This Week

Last Update: 2025-01-21
See Project
6

OpenChat for Linux

OpenChat for Linux — a fast, lightweight desktop client for ChatGPT

OpenChat for Linux is a desktop client for ChatGPT / OpenAI Chat designed specifically for Linux. It’s built with Tauri (Rust) for low resource usage and stability, and it uses a “message window” approach (keeps a small active slice of the conversation and loads more as you scroll) so long chats don’t bog down or crash the app. Downloads are available in common Linux formats (AppImage, Debian package, tarball), with additional packaging manifests for Flatpak, Snap, RPM, AUR, and Nix.

Downloads: 18 This Week

Last Update: 2026-03-18
See Project
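
The "message window" approach described for OpenChat for Linux can be sketched as a bounded view over a longer log. This is a minimal illustration, not the app's actual code (which is built with Tauri and Rust). The class name, the window and page sizes, and the use of an in-memory list as a stand-in for stored history are all assumptions.

```python
from collections import deque


class MessageWindow:
    """Keeps only a small active slice of a long conversation in memory."""

    def __init__(self, history, window_size=50, page_size=25):
        self._history = history      # full log; stands in for on-disk storage
        self._page_size = page_size
        self._start = max(0, len(history) - window_size)
        self.active = deque(history[self._start:])

    def load_older(self):
        """Prepend one more page of older messages, as if the user scrolled up."""
        new_start = max(0, self._start - self._page_size)
        older = self._history[new_start:self._start]
        # extendleft reverses its input, so reverse first to keep order.
        self.active.extendleft(reversed(older))
        self._start = new_start
        return len(older)


history = [f"msg {i}" for i in range(200)]
window = MessageWindow(history)
print(len(window.active), window.active[0])   # newest 50 messages only
window.load_older()
print(len(window.active), window.active[0])   # one extra page after scrolling
```

The design point is that rendering cost depends on the window size rather than on the total conversation length.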
7

ClawBridge

The OpenClaw Mobile Dashboard.

Monitor your agent's real-time thoughts and actions, track token costs, and manage tasks from anywhere using your pocket-sized Mission Control.

Downloads: 0 This Week

Last Update: 2026-02-26
See Project
8

Mixtral offloading

Run Mixtral-8x7B models in Colab or consumer desktops

Mixtral-Offloading is an open-source project designed to enable efficient inference of large Mixture-of-Experts language models such as Mixtral-8x7B on hardware with limited GPU memory. The project implements techniques that allow model components to be dynamically moved between CPU memory and GPU memory during inference, significantly reducing the amount of GPU VRAM required to run the model. This approach takes advantage of the sparse activation properties of mixture-of-experts architectures, where only a subset of expert networks are used for each token during generation. ...

Downloads: 0 This Week

Last Update: 2026-03-06
See Project
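
Because only a few experts are active per token, a small set of experts can stay resident in GPU memory while the rest wait in CPU memory. The sketch below simulates that idea with a least-recently-used cache. It is one plausible policy under stated assumptions, not a description of how Mixtral-Offloading is implemented. The class name, the capacity, and the routing sequence are hypothetical, and the "weights" are placeholder strings.

```python
from collections import OrderedDict


class ExpertCache:
    """LRU cache standing in for the limited set of experts resident on the GPU."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._on_gpu = OrderedDict()   # expert_id -> placeholder weights
        self.hits = 0
        self.misses = 0

    def fetch(self, expert_id):
        if expert_id in self._on_gpu:
            self._on_gpu.move_to_end(expert_id)    # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1                        # would be a CPU -> GPU copy
            if len(self._on_gpu) >= self.capacity:
                self._on_gpu.popitem(last=False)    # evict least recently used
            self._on_gpu[expert_id] = f"weights-{expert_id}"
        return self._on_gpu[expert_id]

    def resident(self):
        return list(self._on_gpu)


# Assumption: each token is routed to 2 of 8 experts in a layer.
cache = ExpertCache(capacity=4)
for routed in [(0, 3), (0, 3), (1, 3), (5, 0), (2, 7)]:
    for expert_id in routed:
        cache.fetch(expert_id)
print(cache.hits, cache.misses, cache.resident())
```

Each miss corresponds to a transfer from CPU memory, so the hit rate of whatever policy is used largely determines how much offloading slows generation.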
9

OnnxStream

Lightweight inference library for ONNX files, written in C++

...Generally, major machine learning frameworks and libraries are focused on minimizing inference latency and/or maximizing throughput, all at the cost of RAM usage. So I decided to write a super small and hackable inference library specifically focused on minimizing memory consumption: OnnxStream. OnnxStream is based on the idea of decoupling the inference engine from the component responsible for providing the model weights, which is a class derived from WeightsProvider. A WeightsProvider specialization can implement any type of loading, caching, and prefetching of the model parameters.

Downloads: 12 This Week

Last Update: 2024-08-14
See Project
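
The decoupling described for OnnxStream can be illustrated with a small interface sketch. OnnxStream itself is written in C++, and this Python version does not mirror its real class signatures. Apart from the WeightsProvider name taken from the description, every class, method, and parameter here is hypothetical.

```python
from abc import ABC, abstractmethod


class WeightsProvider(ABC):
    """Supplies weights by name; the engine never knows where they come from."""

    @abstractmethod
    def get(self, name):
        ...


class DictWeightsProvider(WeightsProvider):
    """Loads on every request from a backing store (a dict standing in for disk)."""

    def __init__(self, store):
        self._store = store
        self.loads = 0

    def get(self, name):
        self.loads += 1
        return self._store[name]


class CachingWeightsProvider(WeightsProvider):
    """Wraps another provider and keeps at most max_items weights in RAM."""

    def __init__(self, inner, max_items):
        self._inner = inner
        self._max_items = max_items
        self._cache = {}

    def get(self, name):
        if name not in self._cache:
            if len(self._cache) >= self._max_items:
                self._cache.pop(next(iter(self._cache)))  # drop oldest insertion
            self._cache[name] = self._inner.get(name)
        return self._cache[name]


def run_inference(provider, layer_names, x):
    """Toy engine: one scalar weight per layer, fetched only when needed."""
    for name in layer_names:
        x = x * provider.get(name)
    return x


store = {"layer1": 2.0, "layer2": 3.0}
disk = DictWeightsProvider(store)
print(run_inference(disk, ["layer1", "layer2"], 1.0))
```

Swapping DictWeightsProvider for CachingWeightsProvider changes the memory and load behavior without touching run_inference, which is the point of the separation.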
10

SuperAGI

A dev-first open source autonomous AI agent framework

...Connect to multiple vector DBs to enhance your agent’s performance. Each agent is unique and can use a different model of your choice. Get insights into your agent’s performance and optimize accordingly. Control token usage to manage costs effectively. Enable your agents to learn and adapt by storing their memory. Get notified when agents get stuck in a loop, and provide proactive resolution. Read and store files generated by agents.

Downloads: 0 This Week

Last Update: 2024-01-12
See Project
11

Firefly LLM

A large model training tool that supports training large models

Firefly is an open-source framework designed to simplify the training and fine-tuning of large language models through a unified and configurable workflow. The project provides a comprehensive environment where developers can perform tasks such as model pre-training, instruction tuning, and preference optimization using widely adopted machine learning techniques. Its architecture supports both full-parameter training and parameter-efficient strategies like LoRA and QLoRA, making it suitable...

Downloads: 0 This Week

Last Update: 2026-03-08
See Project
12

Punica

Serving multiple LoRA-finetuned LLMs as one

Punica is a system designed to efficiently serve multiple LoRA-fine-tuned large language models within a shared GPU environment. LoRA is a parameter-efficient fine-tuning method that allows developers to adapt large pretrained models to specific tasks by adding lightweight adapter layers rather than retraining the entire model. Punica introduces a serving architecture that allows multiple LoRA adapters to share the same base model during inference, significantly reducing memory consumption...

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
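
The memory saving from sharing one base model across several LoRA adapters, as described for Punica, can be estimated with a parameter count. This is a back-of-the-envelope sketch and not Punica's implementation. The function names, the single 4096 x 4096 weight matrix, and rank 16 are illustrative assumptions.

```python
def lora_adapter_params(d_in, d_out, rank):
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


def serving_params(base_params, adapter_params, num_models, share_base):
    """Total parameters held in memory when serving num_models variants."""
    if share_base:
        return base_params + num_models * adapter_params
    return num_models * (base_params + adapter_params)


base = 4096 * 4096                         # one illustrative weight matrix
adapter = lora_adapter_params(4096, 4096, rank=16)
shared = serving_params(base, adapter, num_models=10, share_base=True)
separate = serving_params(base, adapter, num_models=10, share_base=False)
print(adapter, shared, separate)
```

With these numbers each adapter is under 1% of the base matrix, so ten variants on a shared base cost only slightly more than a single model.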
13

gpu_poor

Calculate token/s & GPU memory requirement for any LLM

gpu_poor is an open-source tool designed to help developers determine whether their hardware is capable of running a specific large language model and to estimate the performance they can expect from it. The project focuses on calculating GPU memory requirements and predicted inference speed for different models, hardware configurations, and quantization strategies. By analyzing factors such as model size, context length, batch size, and GPU specifications, the system estimates how much VRAM...

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
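
The factors listed for gpu_poor (model size, context length, batch size, quantization) combine roughly as weights plus KV cache. The sketch below is a simplified estimate and not the tool's actual formula. It assumes standard multi-head attention with a 16-bit KV cache and ignores activations and framework overhead. The function name and the example model shape are illustrative.

```python
def estimate_vram_gib(num_params, bits_per_weight, num_layers, hidden_size,
                      context_len, batch_size, kv_bytes=2):
    """Rough VRAM estimate in GiB: quantized weights plus the KV cache."""
    weights = num_params * bits_per_weight / 8
    # One key and one value vector of hidden_size per layer per token.
    kv_cache = 2 * num_layers * hidden_size * context_len * batch_size * kv_bytes
    return (weights + kv_cache) / 2**30


# Assumed shape: 7B parameters, 32 layers, hidden size 4096, 4-bit weights.
estimate = estimate_vram_gib(7e9, 4, num_layers=32, hidden_size=4096,
                             context_len=4096, batch_size=1)
print(f"{estimate:.2f} GiB")
```

Note that the KV cache term grows linearly with both context length and batch size, so long contexts can outweigh the savings from weight quantization.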
14

Synaptrix ChatGPT Desktop

Fuel your productivity with ChatGPT-Desktop

Fuel your productivity with ChatGPT-Desktop, blazingly fast and supercharged.

Downloads: 0 This Week

Last Update: 2024-07-01
See Project
15

ChatLLM Web

Chat with LLM like Vicuna totally in your browser with WebGPU

...To use this app, you need a browser that supports WebGPU, such as Chrome 113 or Chrome Canary. Chrome versions ≤ 112 are not supported. You will need a GPU with about 6.4GB of memory. If your GPU has less memory, the app will still run, but the response time will be slower. The first time you use the app, you will need to download the model. For the Vicuna-7b model that we are currently using, the download size is about 4GB. After the initial download, the model will be loaded from the browser cache for faster usage.

Downloads: 0 This Week

Last Update: 2023-08-25
See Project
16

Node ChatGPT API

A client implementation for ChatGPT and Bing AI

...Conversations are stored in memory by default, but you can optionally install a storage adapter to persist conversations to a database.

Downloads: 0 This Week

Last Update: 2023-05-31
See Project
17

LightSeq

A High Performance Library for Sequence Processing and Generation

LightSeq is a high-performance library focused on efficient inference and training for deep learning models, especially large language models (LLMs) and transformer-based architectures. Its goal is to optimize both memory usage and computational throughput, enabling faster training or inference on limited hardware while maintaining model quality. LightSeq provides optimized CUDA kernels, quantization strategies, and runtime optimizations tailored for transformer operations, which are often bottlenecks in conventional frameworks, thereby reducing memory footprint, improving speed, and making deployment of large-scale models more accessible. ...

Downloads: 0 This Week

Last Update: 2025-12-02
See Project
18

Whatlang-RS

Natural language detection library for Rust

Whatlang-RS is a Rust-based language detection library optimized for speed and accuracy, supporting a wide range of languages with probabilistic models.

Downloads: 0 This Week

Last Update: 2025-01-24
See Project
19

KoboldAI

Your gateway to GPT writing

...Whether you want to use the free, fast power of Google Colab, your own high-end graphics card, an online service you have an API key for (like OpenAI or InferKit), or would rather just run it more slowly on your CPU, you will be able to find a way to use KoboldAI that works for you.

Downloads: 149 This Week

Last Update: 2022-12-01
See Project
20

min(DALL·E)

min(DALL·E) is a fast, minimal port of DALL·E Mini to PyTorch

...The only third-party dependencies are numpy, requests, pillow and torch. The required models will be downloaded to models_root if they are not already there. Set the dtype to torch.float16 to save GPU memory. If you have an Ampere architecture GPU you can use torch.bfloat16. Set the device to either "cuda" or "cpu". Once everything has finished initializing, call generate_image with some text as many times as you want. Use a positive seed for reproducible results. Higher values for supercondition_factor result in better agreement with the text but a narrower variety of generated images. ...

Downloads: 0 This Week

Last Update: 2022-08-04
See Project
21

flutter_ume

UME is an in-app debug kits platform for Flutter

flutter_ume is an in-app debug-kit platform for Flutter applications, developed by ByteDance’s Flutter Infra team. It lets developers embed a suite of debugging tools directly into a Flutter app (during development or debug builds), enabling inspection, performance monitoring, UI debugging, network request inspection, widget hierarchy introspection, and more — all from within the running app. UME bundles multiple “plugin kits” (e.g., UI inspector, performance monitor, device info panel,...

Downloads: 0 This Week

Last Update: 2025-12-02
See Project
22

Fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python

Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks. We provide reference implementations of various sequence modeling papers. Recent work by Microsoft and Google has shown that data parallel training can be made significantly more efficient by sharding the model parameters and optimizer state across data parallel workers. These ideas are encapsulated in the...

Downloads: 0 This Week

Last Update: 2022-06-27
See Project
23

MACE

Deep learning inference framework optimized for mobile platforms

Mobile AI Compute Engine (or MACE for short) is a deep learning inference framework optimized for mobile heterogeneous computing on Android, iOS, Linux and Windows devices. The runtime is optimized with NEON, OpenCL and Hexagon, and the Winograd algorithm is used to speed up convolution operations. Initialization is also optimized to be faster. Chip-dependent power options such as big.LITTLE scheduling and Adreno GPU hints are included as advanced APIs. UI responsiveness guarantee is sometimes...

Downloads: 0 This Week

Last Update: 2022-01-13
See Project
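
The Winograd algorithm mentioned in the MACE description speeds up convolution by trading multiplications for additions. Below is a minimal one-dimensional sketch of the F(2,3) case, which computes two outputs of a 3-tap filter with 4 multiplications instead of 6. It illustrates the general technique only and is not MACE's kernel code. The function names are hypothetical.

```python
def direct_conv_1d(d, g):
    """Two outputs of a 3-tap filter over four inputs: 6 multiplications."""
    return [d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
            d[1] * g[0] + d[2] * g[1] + d[3] * g[2]]


def winograd_f23(d, g):
    """The same two outputs using 4 multiplications (Winograd F(2,3))."""
    # Filter transform; in practice this is computed once per filter.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Four element-wise products on transformed inputs.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    return [m1 + m2 + m3, m2 - m3 - m4]


signal = [1.0, 2.0, 3.0, 4.0]
taps = [1.0, 0.0, -1.0]
print(direct_conv_1d(signal, taps), winograd_f23(signal, taps))
```

Both functions return the same values for the same inputs, up to floating-point rounding. The saving grows in the two-dimensional case used for 3x3 convolutions.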
24

TurboTransformers

Fast and user-friendly runtime for transformer inference

TurboTransformers is a high-performance inference framework optimized for running Transformer models efficiently on CPUs and GPUs. It improves latency and throughput for NLP applications.

Downloads: 0 This Week

Last Update: 2025-01-24
See Project
25

TextBrewer

A PyTorch-based knowledge distillation toolkit

...It includes various distillation techniques from both the NLP and CV fields and provides an easy-to-use distillation framework, which allows users to quickly experiment with state-of-the-art distillation methods to compress a model with a relatively small sacrifice in performance, increasing inference speed and reducing memory usage.

Downloads: 0 This Week

Last Update: 2025-01-22
See Project
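
Knowledge distillation, the technique behind TextBrewer, trains a small student model to match the softened output distribution of a larger teacher. The sketch below shows one common form of the loss in plain Python. It is a generic illustration and does not use TextBrewer's API. The function names and the temperature value are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


teacher = [2.0, 1.0, 0.0]
print(distillation_loss([0.0, 0.0, 0.0], teacher))   # untrained student
print(distillation_loss(teacher, teacher))           # student matches teacher
```

The loss is lowest when the student's distribution matches the teacher's, so minimizing it pulls the smaller model toward the larger model's behavior.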
