Persistent HTTP cache for Python requests
A caching extension for Flask
Redundancy-aware KV Cache Compression for Reasoning Models
Semantic cache for LLMs. Fully integrated with LangChain
Unified KV Cache Compression Methods for Auto-Regressive Models
Cache-Augmented Generation: A Simple, Efficient Alternative to RAG
Supercharge Your LLM with the Fastest KV Cache Layer
A simple app that provides Django integration for RQ
Binance Exchange API Python implementation for automated trading
Export Django monitoring metrics for Prometheus.io
A blog system based on Python 3.8 and Django 3.0
Flet enables developers to easily build real-time web and mobile apps
Small tool to download PS1/PS2 covers for DuckStation and PCSX2
Python cleanup script for macOS
High-performance Inference and Deployment Toolkit for LLMs and VLMs
The comprehensive WSGI web application library
The Web framework for perfectionists with deadlines
A pluggable app that runs a full check on the deployment
RGBD video generation model conditioned on camera input
Install and manage a high performance WordPress stack
Ansible module to manage packages from the AUR
Bring the notion of Model-as-a-Service to life
A Model Context Protocol (MCP) Gateway & Registry
Claude Code, but it runs on your Mac for free
A course on LLM inference serving on Apple Silicon