Persistent HTTP cache for python requests
A caching extension for Flask
Redundancy-aware KV Cache Compression for Reasoning Models
Semantic cache for LLMs. Fully integrated with LangChain
Unified KV Cache Compression Methods for Auto-Regressive Models
Cache-Augmented Generation: A Simple, Efficient Alternative to RAG
Supercharge Your LLM with the Fastest KV Cache Layer
A simple app that provides django integration for RQ
Binance Exchange API python implementation for automated trading
Export Django monitoring metrics for Prometheus.io
A blog system based on python3.8 and Django3.0
Python cleanup script for macOS
High-performance Inference and Deployment Toolkit for LLMs and VLMs
The comprehensive WSGI web application library
The Web framework for perfectionists with deadlines
Small tool to download PS1/PS2 covers for DuckStation and PCSX2
Flet enables developers to easily build realtime web and mobile apps
a pluggable app that runs a full check on the deployment
RGBD video generation model conditioned on camera input
Install and manage a high performance WordPress stack
Ansible module to manage packages from the AUR
Bring the notion of Model-as-a-Service to life
Claude Code, but it runs on your Mac for free
A course of learning LLM inference serving on Apple Silicon
A high-performance algorithmic trading platform