pangu.py is a Python port of the Pangu spacing tool. It automatically inserts whitespace between CJK characters and adjacent Latin letters, digits, or symbols. Mixed-script text often looks cramped or reads ambiguously without that spacing, and the library applies a small set of typographic rules to make it more readable. It works both as a Python library and as a command-line utility, so you can process strings in code, tidy files in bulk, or wire it into documentation and build pipelines. The transformation is idempotent: running it on already-spaced text adds no further spaces, which makes it safe in automated workflows. It’s designed to be pragmatic and lightweight, with sensible defaults that handle common edge cases found in websites, blogs, and multilingual technical docs. Because it favors simple rules over heavy linguistic analysis, it’s easy to adopt and improves mixed CJK/Latin text right away.
Features
- Automatic spacing between CJK characters and Latin letters, digits, and symbols
- Library API for strings plus a CLI for files and streams
- Idempotent behavior to prevent double-spacing on repeated runs
- Works in scripts, CI pipelines, and static-site generators
- Minimal dependencies and quick setup in Python environments
- Sensible defaults with predictable, readable output
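Because the transformation is idempotent, a CI job can treat "the spacing function changes this file" as a lint failure. The sketch below assumes a hypothetical helper, `files_needing_spacing`, which takes the spacing function as a parameter so it does not depend on any particular API. In practice you would likely pass the library's string-spacing function (`pangu.spacing_text` in the project's documentation, but check this against your installed version).

```python
from pathlib import Path
from typing import Callable, Iterable, List


def files_needing_spacing(
    paths: Iterable[Path], spacer: Callable[[str], str]
) -> List[Path]:
    """Return the files whose content would change if `spacer` were applied.

    `spacer` is any idempotent str -> str spacing function (hypothetical
    parameter; the library's own function is not imported here).
    """
    changed = []
    for path in paths:
        original = path.read_text(encoding="utf-8")
        if spacer(original) != original:
            changed.append(path)
    return changed
```

A build script could call this on `Path("docs").rglob("*.md")` and exit with a non-zero status when the returned list is non-empty.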