The challenge is to run Stable Diffusion 1.5, which includes a large transformer model with almost 1 billion parameters, on a Raspberry Pi Zero 2, a microcomputer with 512MB of RAM, without adding more swap space and without offloading intermediate results to disk. The recommended minimum RAM/VRAM for Stable Diffusion 1.5 is typically 8GB. Major machine learning frameworks and libraries generally focus on minimizing inference latency and/or maximizing throughput, usually at the cost of RAM usage. So I decided to write a small, hackable inference library specifically focused on minimizing memory consumption: OnnxStream. OnnxStream is based on the idea of decoupling the inference engine from the component responsible for providing the model weights, which is a class derived from WeightsProvider. A WeightsProvider specialization can implement any type of loading, caching, and prefetching of the model parameters.
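The decoupling described above can be illustrated with a minimal C++ sketch. This is not the OnnxStream API: the names WeightsProviderSketch, NoCacheProvider, CachingProvider, and run_layers are hypothetical, the "model" is a toy chain of element-wise multiplications, and an in-memory map stands in for weights stored on disk. The point is only that the engine asks for one layer's weights at a time, while the provider alone decides whether anything stays resident.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interface (NOT the real OnnxStream class): the engine requests
// weights by name and does not know how or from where they are loaded.
class WeightsProviderSketch {
public:
    virtual ~WeightsProviderSketch() = default;
    virtual std::vector<float> get(const std::string& name) = 0;
};

// Policy 1: no caching. Every request "loads" the weights again.
// The map is an assumption standing in for a file on disk.
class NoCacheProvider : public WeightsProviderSketch {
public:
    explicit NoCacheProvider(std::map<std::string, std::vector<float>> storage)
        : storage_(std::move(storage)) {}
    std::vector<float> get(const std::string& name) override {
        ++loads_;
        return storage_.at(name);  // returns a fresh copy per request
    }
    int loads() const { return loads_; }
private:
    std::map<std::string, std::vector<float>> storage_;
    int loads_ = 0;
};

// Policy 2: cache everything after the first load (faster, more RAM).
class CachingProvider : public WeightsProviderSketch {
public:
    explicit CachingProvider(WeightsProviderSketch& inner) : inner_(inner) {}
    std::vector<float> get(const std::string& name) override {
        auto it = cache_.find(name);
        if (it == cache_.end()) {
            it = cache_.emplace(name, inner_.get(name)).first;
        }
        return it->second;
    }
private:
    WeightsProviderSketch& inner_;
    std::map<std::string, std::vector<float>> cache_;
};

struct RunResult {
    std::vector<float> output;
    std::size_t peak_weight_floats;  // largest weight buffer held by the engine
};

// Toy "engine": each layer multiplies the activations element-wise by its
// weights. Only the current layer's weights are alive inside the loop body.
RunResult run_layers(WeightsProviderSketch& provider,
                     const std::vector<std::string>& layer_names,
                     std::vector<float> x) {
    std::size_t peak = 0;
    for (const auto& name : layer_names) {
        std::vector<float> w = provider.get(name);
        assert(w.size() == x.size());
        peak = std::max(peak, w.size());
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] *= w[i];
        }
    }  // w is released at the end of each iteration
    return {std::move(x), peak};
}
```

Swapping NoCacheProvider for CachingProvider changes the memory/latency trade-off without touching run_layers, which is the design choice the paragraph above describes.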

Features

  • OnnxStream can consume up to 55x less memory than OnnxRuntime with only a 50% to 200% increase in latency
  • Documentation available
  • OnnxStream is based on the idea of decoupling the inference engine from the component responsible for providing the model weights
  • Major machine learning frameworks and libraries are focused on minimizing inference latency
  • Examples available
  • The OnnxStream Stable Diffusion example implementation now supports SDXL 1.0

License

MIT License

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

C++

Related Categories

C++ Machine Learning Software, C++ Raspberry Pi Software, C++ LLM Inference Tool

Registered

2024-08-14