cpDetector is a proxy for codepage detection of documents. It delegates to multiple detector instances, each of which tries to detect the codepage with a different technique. A command-line executable is shipped that sorts documents by codepage.
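The delegation idea can be illustrated with a minimal, self-contained sketch. This is not cpDetector's actual API: the `Strategy` interface, the `BOM` and `ASCII` strategies, and the `detect` method are hypothetical names chosen for illustration. The sketch assumes that strategies are tried in order and that the first one to return a result wins. It also ignores UTF-32 byte order marks, which a real detector would have to check before UTF-16.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Optional;

public class DetectionChainSketch {

    /** Hypothetical strategy interface; not cpDetector's real API. */
    interface Strategy {
        Optional<Charset> detect(byte[] data);
    }

    /** Recognizes the UTF-8 and UTF-16 byte order marks (UTF-32 is ignored here). */
    static final Strategy BOM = data -> {
        if (data.length >= 3 && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB && (data[2] & 0xFF) == 0xBF) {
            return Optional.of(StandardCharsets.UTF_8);
        }
        if (data.length >= 2 && (data[0] & 0xFF) == 0xFE && (data[1] & 0xFF) == 0xFF) {
            return Optional.of(StandardCharsets.UTF_16BE);
        }
        if (data.length >= 2 && (data[0] & 0xFF) == 0xFF && (data[1] & 0xFF) == 0xFE) {
            return Optional.of(StandardCharsets.UTF_16LE);
        }
        return Optional.empty();
    };

    /** Reports US-ASCII when the input is non-empty and every byte is below 0x80. */
    static final Strategy ASCII = data -> {
        if (data.length == 0) {
            return Optional.empty();
        }
        for (byte b : data) {
            if ((b & 0x80) != 0) {
                return Optional.empty();
            }
        }
        return Optional.of(StandardCharsets.US_ASCII);
    };

    /** Cheap, unambiguous checks first; a guessing strategy would go last. */
    static final List<Strategy> DEFAULT = List.of(BOM, ASCII);

    /** The proxy: asks each strategy in turn and returns the first answer. */
    static Optional<Charset> detect(byte[] data, List<Strategy> strategies) {
        for (Strategy s : strategies) {
            Optional<Charset> result = s.detect(data);
            if (result.isPresent()) {
                return result;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x61};
        System.out.println(detect(withBom, DEFAULT));
        System.out.println(detect("plain text".getBytes(StandardCharsets.US_ASCII), DEFAULT));
    }
}
```

Ordering matters in a chain like this: a byte order mark is decisive when present, so it is checked before the weaker ASCII test, and an empty result lets the caller fall back to a default.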
Features
- Extendable framework for detection strategies
- Byte order mark detection
- ASCII detection
- Guessing strategy (jchardet, based on the Mozilla code page detection)
- XML header detection
- HTML header detection
- Command line interface for transcoding / detecting / sorting (by codepage) trees of files
- See comparison: http://fredeaker.blogspot.com/2007/01/character-encoding-detection.html
- Fast: http://tinyurl.com/cpdetector-icu-performance
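As an illustration of what XML header detection involves, the following sketch reads the `encoding` pseudo-attribute from an XML declaration. It is an assumption-laden example, not cpDetector's implementation: the class and method names are hypothetical, only the first 256 bytes are inspected, and the declaration is assumed to be stored in an ASCII-compatible encoding (so UTF-16 documents are not handled).

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class XmlHeaderSketch {

    // Matches an XML declaration at the very start of the input and captures
    // the value of its encoding pseudo-attribute.
    private static final Pattern XML_DECL = Pattern.compile(
            "^<\\?xml[^>]*?\\bencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** Returns the declared encoding name, or empty if no declaration is found. */
    static Optional<String> declaredEncoding(byte[] data) {
        // ISO-8859-1 maps every byte to one char, so decoding never fails
        // and ASCII characters in the declaration are preserved.
        int length = Math.min(data.length, 256);
        String head = new String(data, 0, length, StandardCharsets.ISO_8859_1);
        Matcher m = XML_DECL.matcher(head);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        byte[] doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-2\"?><a/>"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(declaredEncoding(doc));
    }
}
```

The returned name is only a claim made by the document; a caller would still need to check that the platform supports it, for example with `Charset.isSupported`, before decoding with it.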
License
Mozilla Public License 1.1 (MPL 1.1)