From: Roland H. <ro...@lo...> - 2025-07-08 14:14:05
On 7/8/2025 7:57 AM, je...@fo... wrote:
> Thanks for this wonderful background. 32-bit wide characters would
> indeed incur enormous bloat, but thankfully, it seems UTF8 encoding
> is brilliantly leaving most European languages very close to 1 byte
> per character; even for people in Korea, Japan, and China, UTF8 is never
> bigger than 32-bit wide characters, but all punctuation, numbers, etc.
> are mercifully as short as 1 byte.
>
> Now RAM and DISK space is cheaper than ever, but the biggest problem
> was always software. UTF8 also makes software *mostly* able to deal
> with wide characters w/o undue pain and suffering.

Since I work mostly with embedded systems, I've been working on a fork of
CopperSpice (which is a fork of Qt 4.8 sans QML). One of the many things on
my never-ending TODO list is to replace all of the QString classes with
BdString classes based on UTF8-CPP.

https://github.com/nemtrif/utfcpp?tab=readme-ov-file

I just need to make certain the Boost Software License 1.0 is compatible.

https://github.com/nemtrif/utfcpp/blob/master/LICENSE

If you are going to pop the hood on strings, you probably want to look at
that library. I don't remember if Fox is UTF-16 under the hood or pure UTF-8.

Yes, UTF-8 has won. Both wide strings and UTF-16 have proven to be failures.
Microsoft and a lot of legacy products are trapped with UTF-16.

Too bad embedded systems customers don't like the old Motif look and feel
Fox has.

-- 
Roland Hughes, President
Logikal Solutions
(630)-205-1593 (cell)
https://theminimumyouneedtoknow.com
https://infiniteexposure.net
https://johnsmith-book.com