Showing 1 open source project for "preload"

View related business solutions
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • $300 in Free Credit Towards Top Cloud Services Icon
    $300 in Free Credit Towards Top Cloud Services

    Build VMs, containers, AI, databases, storage—all in one place.

    Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.
    Get Started
  • 1
    PowerInfer

    PowerInfer

    High-speed Large Language Model Serving for Local Deployment

    ...The project focuses on improving the performance of local AI inference by optimizing how neural network computations are distributed between CPU and GPU resources. Its architecture exploits the observation that only a subset of neurons in large models are frequently activated, allowing the system to preload frequently used neurons into GPU memory while processing less common activations on the CPU. This hybrid execution strategy significantly reduces memory bottlenecks and improves overall inference speed. PowerInfer incorporates specialized algorithms and sparse operators to manage neuron activation patterns and minimize data transfers between hardware components. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB