220 projects for "cpu memory usage" with 1 filter applied:

  • 1
    FlexLLMGen

    Running large language models on a single GPU

    ...The system focuses on high-throughput generation workloads where large batches of text must be processed quickly, such as large-scale data extraction or document analysis tasks. Instead of requiring expensive multi-GPU systems, the framework uses techniques such as memory offloading, compression, and optimized batching to run large models on commodity hardware. The architecture distributes computation and memory usage across the GPU, CPU, and disk in order to maximize the number of tokens processed during inference. This design allows organizations to deploy powerful language models for high-volume tasks without the infrastructure costs typically associated with large-scale AI systems. ...
    Downloads: 0 This Week
  • 2
    Tracy Profiler

    Frame profiler

    A real-time, nanosecond-resolution, remote-telemetry, hybrid frame and sampling profiler for games and other applications. Tracy profiles the CPU (with direct support for C, C++, Lua, and Python integration; third-party bindings exist for many other languages, such as Rust, Zig, C#, OCaml, and Odin), the GPU (all major graphics APIs: OpenGL, Vulkan, Direct3D 11/12, and OpenCL), memory allocations, locks, and context switches. It can also automatically attribute screenshots to captured frames, and much more.
    Downloads: 2 This Week
  • 3
    Lightweight ring buffer manager

    Lightweight generic ring buffer manager library

    The library provides a generic FIFO ring buffer implementation.
    Downloads: 0 This Week
  • 4
    GPU Hot

    Real-time NVIDIA GPU dashboard

    ...The project offers a self-hosted web interface that streams hardware metrics directly from GPU servers, enabling developers, ML engineers, and system administrators to observe GPU utilization and system behavior in real time through a browser. The dashboard collects and displays a wide range of performance metrics including temperature, memory usage, power consumption, clock speeds, fan speed, and active processes. It can scale from monitoring a single GPU workstation to large distributed environments with dozens or even hundreds of GPUs by running lightweight containers on each node and aggregating the data centrally.
    Downloads: 3 This Week
  • 5
    LuxTTS

    A high-quality rapid TTS voice cloning model

    ...Its design emphasizes efficiency and practicality, fitting within modest GPU memory footprints.
    Downloads: 4 This Week
  • 6
    JMX Exporter

    A process for exposing JMX Beans via HTTP for Prometheus consumption

    ...It can also be run as a standalone HTTP server that scrapes remote JMX targets, but this has various disadvantages, such as being harder to configure and being unable to expose process metrics (e.g., memory and CPU usage). Running the exporter as a Java agent is strongly encouraged.
    Downloads: 3 This Week
  • 7
    leak-check

    Personal Information “Leakage” Detection Interface

    leak-check is a utility designed to help developers detect memory leaks and resource mismanagement in applications. It provides tools to monitor allocations, track usage patterns, and identify potential leaks during runtime. The project focuses on improving application stability by highlighting inefficiencies in memory handling. It can be integrated into development workflows to catch issues early in the debugging process. leak-check is particularly useful in performance-critical applications where memory management is essential. ...
    Downloads: 0 This Week
  • 8
    shimmy

    Python-free Rust inference server

    ...This compatibility enables developers to replace remote AI services with locally hosted models while keeping their existing software architecture intact. Shimmy focuses on performance and simplicity, using efficient runtime components to minimize memory usage and startup time compared to heavier inference frameworks. It supports modern model formats such as GGUF and SafeTensors and can automatically discover models stored locally or in common directories used by other AI tools. Advanced capabilities include CPU offloading for Mixture-of-Experts models and GPU acceleration, enabling large models to run on consumer hardware with limited VRAM.
    Downloads: 2 This Week
  • 9
    AirLLM

    AirLLM 70B inference with single 4GB GPU

    AirLLM is an open source Python library that enables extremely large language models to run on consumer hardware with very limited GPU memory. The project addresses one of the main barriers to local LLM experimentation by introducing a memory-efficient inference technique that loads model layers sequentially rather than storing the entire model in GPU memory. This layer-wise inference approach allows models with tens of billions of parameters to run on devices with only a few gigabytes of...
    Downloads: 6 This Week
  • 10
    EasyExcel

    Lightweight Java library developed by Alibaba for reading and writing

    EasyExcel is a Java library focused on reading and writing Excel files with very low memory usage, making it suitable for large datasets that overwhelm traditional APIs. It uses streaming/event-driven parsing to avoid loading entire workbooks into memory, and it maps rows to Java objects via simple annotations. Writers support multiple sheets, custom styles, merged cells, and template-based filling so production reports remain maintainable.
    Downloads: 3 This Week
  • 11
    Hello-Agents

    Building an Intelligent Agent from Scratch

    Hello Agents is an open educational project designed to teach developers how to understand, design, and build AI-native agents from the ground up through structured tutorials and practical examples. The project focuses on guiding learners beyond superficial framework usage toward deeper comprehension of agent architecture, reasoning loops, and real-world implementation patterns. It walks users through core concepts such as ReAct-style reasoning, tool usage, memory handling, and multi-step task execution, enabling hands-on experimentation with modern LLM-powered agent systems. The repository is structured as a progressive learning path, combining theory, exercises, and runnable code so users can incrementally build more capable agents. ...
    Downloads: 2 This Week
  • 12
    PortableGL

    An implementation of OpenGL 3.x-ish in clean C

    PortableGL is a single-header, software-only implementation of a subset of OpenGL (specifically the GL 2.1 pipeline), designed to run entirely on the CPU. This lightweight graphics library allows OpenGL-style rendering without GPU acceleration, making it ideal for educational use, debugging, embedded systems, and retro-style software rendering. Because it mirrors OpenGL syntax and design, it can act as a drop-in CPU renderer for testing or deploying 3D graphics on platforms without GPU support.
    Downloads: 1 This Week
  • 13
    NanoBoyAdvance

    A cycle-accurate Nintendo Game Boy Advance emulator

    NanoBoyAdvance is a cycle-accurate Game Boy Advance emulator that prioritizes precision and correctness in replicating original hardware behavior. It is designed to emulate the GBA at a very low level, including CPU timing, DMA operations, graphics processing, and memory behavior, ensuring that even edge cases and obscure hardware quirks are faithfully reproduced. The emulator achieves extremely high compatibility, passing multiple hardware test suites and accurately running games that rely on precise timing conditions. In addition to accuracy, it introduces enhancements such as a high-quality audio mixer that improves sound output without altering internal emulation behavior. ...
    Downloads: 4 This Week
  • 14
    Stopwatch Component

    Provides a way to profile code

    Symfony Stopwatch is a lightweight component designed to measure the time and memory usage of code execution. It helps developers profile their applications by tracking the duration of specific operations and monitoring performance. Stopwatch is ideal for identifying bottlenecks and optimizing performance, especially in complex Symfony applications. It can also be used independently in any PHP project to measure the efficiency of code segments.
    Downloads: 0 This Week
  • 15
    Pocket TTS

    A TTS that fits in your CPU (and pocket)

    Pocket TTS is a lightweight text-to-speech project designed to run efficiently on CPUs, targeting developers who want local speech generation without depending on GPUs or hosted web APIs. It is built to feel practical in everyday applications, where installation and usage should be as simple as adding a dependency and calling a function. The project focuses on keeping the runtime footprint manageable while still producing natural-sounding speech, which makes it attractive for offline tools, prototypes, and privacy-sensitive workflows. Because it is CPU-oriented, it fits well in server environments where GPU access is limited, in desktop apps, or in edge deployments where simplicity matters more than maximum throughput. ...
    Downloads: 7 This Week
  • 16
    Dozzle

    Realtime log viewer for containers. Supports Docker, Swarm and K8s

    ...The interface includes practical quality-of-life features like fuzzy searching for containers, regex log search, split-screen viewing for multiple logs, and live stats such as CPU and memory usage. It supports more advanced analysis through an in-browser SQL query engine for querying logs, which helps when you need structured filtering without exporting data elsewhere. Dozzle also supports multi-user authentication and can integrate with proxy-forward authorization setups for deployments behind gateways. For larger environments, it can monitor multiple hosts via an agent mode and supports orchestrators like Docker Swarm and Kubernetes.
    Downloads: 11 This Week
  • 17
    bitnet.cpp

    Official inference framework for 1-bit LLMs

    bitnet.cpp is the official open-source inference framework and ecosystem designed to enable ultra-efficient execution of 1-bit large language models (LLMs), which quantize most model parameters to ternary values (-1, 0, +1) while maintaining competitive performance with full-precision counterparts. At its core is bitnet.cpp, a highly optimized C++ backend that supports fast, low-memory inference on both CPUs and GPUs, enabling models such as BitNet b1.58 to run without requiring enormous...
    Downloads: 8 This Week
  • 18
    MemU

    MemU is an open-source memory framework for AI companions

    MemU is an agentic memory layer for LLM applications, specifically designed for AI companions. Transform your memory into an intelligent file system that automatically organizes, connects, and evolves with your memories. Simple, fast, and reliable memory infrastructure for AI applications. Powerful tools and dedicated support to scale your AI applications with confidence.
    Downloads: 1 This Week
  • 19
    QuantumOptics.jl

    Library for the numerical simulation of closed as well as open quantum

    QuantumOptics.jl is a numerical framework written in the Julia programming language that makes it easy to simulate various kinds of open quantum systems. It is inspired by the Quantum Optics Toolbox for MATLAB and the Python framework QuTiP. QuantumOptics.jl optimizes processor usage and memory consumption by relying on different ways to store and work with operators. The framework comes with many predefined systems and interactions, making it easy to focus on the physics rather than the numerics. Every function in the framework has been thoroughly tested, with all tests and their code coverage presented on the framework's GitHub page.
    Downloads: 0 This Week
  • 20
    KVCache-Factory

    Unified KV Cache Compression Methods for Auto-Regressive Models

    ...In large language models, the key-value cache stores intermediate attention states that enable efficient token generation during inference, but these caches can consume large amounts of GPU memory when handling long contexts. KVCache-Factory provides a platform for implementing and evaluating multiple compression strategies that reduce memory usage while preserving model performance. The framework integrates several state-of-the-art methods such as PyramidKV, SnapKV, H2O, and StreamingLLM, allowing researchers to compare and experiment with different approaches within the same environment. ...
    Downloads: 0 This Week
  • 21
    xLSTM

    Neural Network architecture based on ideas of the original LSTM

    ...The architecture aims to provide competitive performance with transformer-based models while maintaining advantages such as linear computational scaling and efficient memory usage for long sequences. Researchers have demonstrated that xLSTM models can scale to billions of parameters and large training datasets while maintaining efficient inference speeds.
    Downloads: 0 This Week
  • 22
    PicoLM

    Run a 1-billion parameter LLM on a $10 board with 256MB RAM

    PicoLM is an open-source inference framework designed to run large language models on extremely constrained hardware environments such as inexpensive single-board computers and embedded systems. The project focuses on enabling efficient local inference by optimizing memory usage, computation, and system dependencies so that relatively large models can operate on devices with minimal RAM. It is written primarily in C and designed with a minimalist architecture that removes unnecessary dependencies and external libraries. The runtime is capable of running language models with billions of parameters on devices with only a few hundred megabytes of memory, which is significantly lower than typical LLM infrastructure requirements. ...
    Downloads: 3 This Week
  • 23
    Dockhand

    Docker management you will like

    ...Designed for homelab enthusiasts, developers, and growing teams, Dockhand offers real-time container lifecycle controls, visual editors for stacks, and a dashboard that shows system metrics like CPU and memory usage. The platform supports Git integration for deploying and syncing Compose stacks directly from repositories, interactive log streaming, and shell access into containers. It also includes tools for managing images, volumes, networks, and container events, making it a comprehensive alternative to traditional command-line workflows. ...
    Downloads: 0 This Week
  • 24
    Tencent-Hunyuan-Large

    Open-source large language model family from Tencent Hunyuan

    ...It is designed with long-context capabilities, quantization support, and high performance on benchmarks across general reasoning, mathematics, language understanding, and Chinese and multilingual tasks. It aims to provide competitive capability with efficient deployment and inference. FP8 quantization support reduces memory usage by roughly 50% while maintaining precision, and the models score well on benchmarks such as MMLU, MATH, CMMLU, and C-Eval.
    Downloads: 2 This Week
  • 25
    OpenAI CS Agents Demo

    Demo of a customer service use case implemented with the OpenAI Agents

    ...It also demonstrates guardrails that validate or constrain responses, memory that maintains context, and tracing that helps debug workflows.
    Downloads: 0 This Week