Why use many token when few token do trick
A theoretical reconstruction of the Claude Mythos architecture
Persistent context and multi-instance coordination
GPU-accelerated decision optimization
An efficient forwarding service designed for LLMs
MoBA: Mixture of Block Attention for Long-Context LLMs
A Unified Framework for Image Customization
Pruna is a model optimization framework built for developers
Superfast AI decision making and processing of multi-modal data
Standardized Serverless ML Inference Platform on Kubernetes
ChatGPT interface with better UI
Low-latency AI inference engine optimized for mobile devices
Local AI coding agent CLI with multi-agent orchestration tools
Parallax is a distributed model serving framework
Build and run agents you can see, understand and trust
A Model Context Protocol (MCP) Gateway & Registry
The most accurate natural language detection library for Python
ZAPI by Adopt AI is an open-source Python library
Ultimate meta-skill for generating best-in-class Claude Code skills
A step-by-step guide to build your own AI agent
Large-language-model & vision-language-model based on Linear Attention
AI Suite for upscaling, interpolating & restoring images/videos
Building Mixture-of-Experts from LLaMA with Continual Pre-training
e-Dokyumento is a web-based Document Management System (DMS)
A PyTorch implementation of "Capsule Graph Neural Network"