uzu is a high-performance inference engine designed to run artificial intelligence models efficiently on Apple Silicon hardware. Written primarily in Rust and leveraging Apple’s Metal framework, the project focuses on maximizing performance when executing large language models and other AI workloads on devices such as Mac computers with M-series chips. The engine implements a hybrid architecture in which model layers can be executed either as custom GPU kernels or through Apple’s MPSGraph API, allowing it to balance performance and compatibility depending on the workload. By utilizing Apple’s unified memory architecture, uzu reduces memory copying overhead and improves inference throughput for local AI workloads. The system includes a simple high-level API that enables developers to run models, create inference sessions, and generate outputs with minimal configuration.

Features

  • High-performance inference engine optimized for Apple Silicon hardware
  • Hybrid execution architecture combining GPU kernels and MPSGraph computation
  • Unified memory utilization for efficient model execution on Apple devices
  • High-level API for creating inference sessions and running AI models
  • Command-line interface for running models, serving APIs, and benchmarking performance
  • Language bindings for Swift and Node.js enabling integration into applications

Project Samples

Project Activity

See All Activity >

License

MIT License

Follow uzu

uzu Web Site

Other Useful Business Software
Gemini 3 and 200+ AI Models on One Platform Icon
Gemini 3 and 200+ AI Models on One Platform

Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

Build generative AI apps with Vertex AI. Switch between models without switching platforms.
Start Free
Rate This Project
Login To Rate This Project

User Reviews

Be the first to post a review of uzu!

Additional Project Details

Programming Language

Rust

Related Categories

Rust Large Language Models (LLM)

Registered

4 days ago