From: Arnout E. <no...@bz...> - 2011-10-21 09:29:26
Hello,

The global field 'use_mb' determines whether to use multibyte-aware functions[1]. It seems this field is usually set to 'true', unless the locale is 'C', 'POSIX', stateful (whatever that means - anyway, we don't support it) or erroneous.

This interpretation of 'use_mb' seems to be a little too strict: right now use_mb is also used to decide which functions to call when reading external strings, for example to choose between 'XTextPropertyToStringList' and 'XmbTextPropertyToTextList' in xwindow_get_text_property.

This means xwindow_get_text_property cannot read UTF-8 text properties under the 'C' or 'POSIX' locales, even though XmbTextPropertyToTextList can and will convert them to the current locale encoding (stripping any multibyte characters).

I think it makes sense to change the meaning of 'use_mb' to 'use multibyte-aware functions when handling strings that are in the current locale encoding (such as all our internal strings)'. For external strings, we should always use multibyte-aware functions to convert them to the internal, locale-dependent encoding.

As for a specific use case: the current use of use_mb causes the (UTF-8, EWMH) _NET_WM_NAME property to be ignored under a 'C' or 'POSIX' locale. This is especially problematic when there is no useful XA_WM_NAME to fall back on, which is ugly anyway but is reported to happen sometimes.

Any objections to changing the semantics of 'use_mb' in this way?

Kind regards,

Arnout

[1]: for example, for deciding between '!iswprint(str_wchar_at(buf, 32))' and 'iscntrl(*buf)', or between 'XmbTextListToTextProperty' and 'XStringListToTextProperty'.
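To make the proposed change in semantics concrete, here is a minimal sketch of the decision logic only. It does not call Xlib. The enum and all three function names are hypothetical and are not taken from the code base, and the locale check is a simplified approximation of how 'use_mb' appears to be set (it ignores the stateful case).

```c
#include <stdbool.h>
#include <string.h>

/* Stand-ins for the two families of Xlib conversion calls named above. */
typedef enum {
    DECODE_PLAIN,     /* e.g. XTextPropertyToStringList */
    DECODE_MULTIBYTE  /* e.g. XmbTextPropertyToTextList */
} decode_kind;

/* Hypothetical approximation of how use_mb ends up set:
 * false for the "C" and "POSIX" locales or when no locale is available. */
static bool use_mb_for_locale(const char *name)
{
    if (name == NULL)
        return false; /* treat a failed locale lookup as "erroneous" */
    return strcmp(name, "C") != 0 && strcmp(name, "POSIX") != 0;
}

/* Current behaviour: external strings follow use_mb as well. */
static decode_kind external_decoder_current(bool use_mb)
{
    return use_mb ? DECODE_MULTIBYTE : DECODE_PLAIN;
}

/* Proposed behaviour: external strings are always decoded with the
 * multibyte-aware call; use_mb only governs internal string handling. */
static decode_kind external_decoder_proposed(bool use_mb)
{
    (void)use_mb; /* deliberately ignored for external strings */
    return DECODE_MULTIBYTE;
}
```

Under this sketch, the two policies differ only when use_mb is false, which is exactly the 'C'/'POSIX' case where _NET_WM_NAME is currently ignored.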