ds4.c

ds4.c is a specialized local inference engine created by antirez for running DeepSeek V4 Flash models directly on Apple Silicon hardware using Metal acceleration. Unlike general-purpose inference runtimes, the project is intentionally optimized for a specific model family, enabling highly efficient execution and simplified architecture. The engine includes DS4-specific model loading, KV cache management, prompt rendering, and OpenAI-compatible server APIs for local deployment workflows. Built as a native low-level implementation, it focuses on performance, reduced abstraction overhead, and direct integration with Apple GPU acceleration through Metal compute graphs. The project also supports streaming inference behavior and local API serving for integration with external tools and AI applications. Overall, ds4 represents a minimalist high-performance approach to running large language models locally without relying on heavyweight inference frameworks.

Features

Local DeepSeek V4 Flash inference engine
Metal-accelerated execution on Apple Silicon
OpenAI-compatible API server support
Specialized KV cache and prompt management
Native lightweight runtime architecture
Streaming local inference workflows

Project Samples

Project Activity

See All Activity >

License

MIT License

Follow ds4.c

ds4.c Web Site

Other Useful Business Software

MongoDB Atlas runs apps anywhere

Deploy in 115+ regions with the modern database for every enterprise.

MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.

Start Free

Rate This Project

User Reviews

Be the first to post a review of ds4.c!

Additional Project Details

Programming Language

Related Categories

C Artificial Intelligence Software

Registered

2026-05-08

Similar Business Software

LM-Kit.NET

LM-Kit.NET is a complete local AI runtime for .NET that lets engineering teams ship AI-powered features without cloud dependencies, per-token costs, or data leaving the network. Most .NET AI integrations stop at inference. LM-Kit.NET covers the full range of capabilities production...

See Software
Gemini Enterprise Agent Platform

Gemini Enterprise Agent Platform is a comprehensive solution from Google Cloud designed to help organizations build, scale, govern, and optimize AI agents. It represents the evolution of Vertex AI, combining advanced model development with new capabilities for agent orchestration and...

See Software
Google Compute Engine

Compute Engine is Google's infrastructure as a service (IaaS) platform for organizations to create and run cloud-based virtual machines. Computing infrastructure in predefined or custom machine sizes to accelerate your cloud transformation. General purpose (E2, N1, N2, N2D) machines provide a...

See Software
LTX

Control every aspect of your video using AI, from ideation to final edits, on one holistic platform. We’re pioneering the integration of AI and video production, enabling the transformation of a single idea into a cohesive, AI-generated video. LTX empowers individuals to share their visions,...

See Software
Frontegg

Frontegg is a Customer Identity and Access Management (CIAM) platform that simplifies authentication, authorization, and user management for SaaS companies. It enables developers to implement advanced identity features quickly, then shift ongoing administration to other teams. With Frontegg,...

See Software
ONLYOFFICE Docs

ONLYOFFICE is an open-source project that offers cloud-based and self-hosted solutions for business of all sizes. The key product is ONLYOFFICE Docs, a secure office suite that seamlessly integrates into the most popular platforms, e.g. Odoo, Alfresco, Confluence, Pipedrive, Redmine, SuiteCRM...

See Software