Re: [xindy] Xindy and UTF8 in rule files
From: Joachim S. <js...@ac...> - 2007-12-28 13:40:01
>>>>> "SS" == Simon Spiegel writes:

SS> On 26.12.2007, at 10:17, Thomas Henlich wrote:
>> Simon Spiegel wrote:
>>> In my understanding this is because the rules file is UTF8. I'm not
>>> really sure whether xindy doesn't handle UTF8 rules at all
>>
>> did you take a look at the xindy-make-rules package? It probably does
>> exactly the sort of thing you want to do, or very similar.

SS> Thanks, I actually had a look at make-rules and tried with a
SS> UTF8-german rule, but this didn't seem to work. From my understanding
SS> (which may be wrong) this rule helps if your .ind-File contains UTF8
SS> characters

Yes, that's the case.

SS> but in my case the rules file contained UTF8 and xindy choked on
SS> that.

In principle, xindy should handle rule files with UTF-8 characters in
strings. I tried that some time ago, though I haven't tried your
example now.

Where xindy chokes is when the UTF-8 file starts with a Byte Order
Mark (BOM), see http://en.wikipedia.org/wiki/Byte-order_mark. That is,
when the file starts with the three bytes (0xef, 0xbb, 0xbf). Some
editors add such a BOM to any UTF-8 file they create.

This is a known deficiency, and I don't know yet how to handle it.
(I can't simply ignore these bytes in the reader; they might appear
as characters in strings.)

Cheers, and best wishes for the new year,

	Joachim

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Joachim Schrod				Email: js...@ac...
xindy maintainer			http://www.xindy.org/
Roedermark, Germany
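
Until the reader copes with a BOM, one possible workaround is to remove
it from the rule file before xindy sees it. The following is a minimal
Python sketch of that idea, not part of xindy; the function names
strip_utf8_bom and strip_bom_in_file are made up for illustration. It
only looks at the first three bytes of the file, so the same byte
sequence appearing later inside a string is left alone.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (0xEF 0xBB 0xBF), if present.

    Only the very start of the data is inspected; identical bytes that
    occur later (e.g. inside a string) are not touched.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_file(path: str) -> bool:
    """Rewrite the file at `path` without its leading BOM.

    Hypothetical helper: returns True if a BOM was found and removed,
    False if the file was left unchanged.
    """
    with open(path, "rb") as f:
        original = f.read()
    cleaned = strip_utf8_bom(original)
    if len(cleaned) == len(original):
        return False  # no BOM at the start, nothing to do
    with open(path, "wb") as f:
        f.write(cleaned)
    return True
```

Running strip_bom_in_file on a rule file (the path is whatever your
style file is called) before invoking xindy should be enough if the BOM
is the only problem; it does not help with any other encoding issue.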