From: Ben F. <ben...@be...> - 2009-12-29 07:50:06
Howdy all,

Docutils is reporting an error when trying to render a reST document
containing some Unicode characters. I'm working on a larger document,
but here is a minimal example showing the problem:

=====
$ rst2html --version
rst2html (Docutils 0.6 [release], Python 2.5.4, on linux2)

$ cat refcard.txt
#############################
Mahjong (麻將) reference card
#############################

============== == == == == == == == == ==
Suit           1  2  3  4  5  6  7  8  9
============== == == == == == == == == ==
Numbers 萬子   �  �  �  �  �  �  �  �  �
Coins 筒子     �  �  �  �  �  �  �  �  �
Bamboos 索子   �  �  �  �  �  �  �  �  �
============== == == == == == == == == ==

$ rst2html refcard.txt > refcard.html
refcard.txt:5: (ERROR/3) Malformed table.
Text in column margin at line offset 3.

============== == == == == == == == == ==
Suit           1  2  3  4  5  6  7  8  9
============== == == == == == == == == ==
Numbers 萬子   �  �  �  �  �  �  �  �  �
Coins 筒子     �  �  �  �  �  �  �  �  �
Bamboos 索子   �  �  �  �  �  �  �  �  �
============== == == == == == == == == ==

$
=====

So according to Docutils, the table has “text in column margin”, but as
far as I can tell looking at the above source text, that's not true.

The text shown above for the source document is valid UTF-8. It consists
of punctuation, English-language alphanumerics, Chinese hanzi, and
Mahjong tile characters. Possibly some of these are confusing to the
Docutils processor.

Is this a bug in Docutils?

-- 
 \      “The difference between religions and cults is determined by |
  `\                    how much real estate is owned.” —Frank Zappa |
_o__)                                                                 |
Ben Finney
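
One way to narrow down a "text in column margin" report like the one above is to compare where text sits when counted in characters with where it appears on screen. The following is only a sketch. It assumes that the table parser measures columns in characters rather than in display cells, which nothing in this thread confirms. The helper names (`display_width`, `column_spans`, `margin_text`) and the two sample rows are made up for illustration and are not taken from Docutils or from the document in question. It is written for Python 3; on a narrow Python 2 build, characters outside the Basic Multilingual Plane are additionally stored as two code units, so `len()` can differ from the code point count as well.

```python
import unicodedata


def display_width(text):
    """Approximate terminal columns: wide and fullwidth characters count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def column_spans(border):
    """Return (start, end) character offsets for each run of '=' in a border line."""
    spans, start = [], None
    for i, ch in enumerate(border + " "):  # trailing space closes the last run
        if ch == "=" and start is None:
            start = i
        elif ch != "=" and start is not None:
            spans.append((start, i))
            start = None
    return spans


def margin_text(border, row):
    """List (offset, char) pairs where a row has non-space text between columns."""
    spans = column_spans(border)
    hits = []
    for (_, end), (next_start, _) in zip(spans, spans[1:]):
        for i in range(end, next_start):
            if i < len(row) and row[i] != " ":
                hits.append((i, row[i]))
    return hits


# Hypothetical rows, not the table from the original message.
border = "======== == =="
row_visual = "Suit 萬  1  2"   # lines up on screen if 萬 is two cells wide
row_chars = "Suit 萬   1  2"  # lines up when every character counts as one

print(len("Suit 萬"), display_width("Suit 萬"))  # 6 7
print(margin_text(border, row_visual))  # [(8, '1'), (11, '2')]
print(margin_text(border, row_chars))   # []
```

If a check like this flags the visually aligned row but not the character-aligned one, padding the source by character count is a possible workaround, at the cost of the table looking ragged in an editor that renders hanzi at double width.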
From: Chris G <cl...@is...> - 2009-12-29 11:06:13
On Tue, Dec 29, 2009 at 06:48:41PM +1100, Ben Finney wrote:
> Howdy all,
>
> Docutils is reporting an error when trying to render a reST document
> containing some Unicode characters. I'm working on a larger document,
> but here is a minimal example showing the problem:
>
> =====
> $ rst2html --version
> rst2html (Docutils 0.6 [release], Python 2.5.4, on linux2)
>
> $ cat refcard.txt
> #############################
> Mahjong (麻將) reference card
> #############################
>
> ============== == == == == == == == == ==
> Suit           1  2  3  4  5  6  7  8  9
> ============== == == == == == == == == ==
> Numbers 萬子   �  �  �  �  �  �  �  �  �
> Coins 筒子     �  �  �  �  �  �  �  �  �
> Bamboos 索子   �  �  �  �  �  �  �  �  �
> ============== == == == == == == == == ==
>
> $ rst2html refcard.txt > refcard.html
> refcard.txt:5: (ERROR/3) Malformed table.
> Text in column margin at line offset 3.
>
> ============== == == == == == == == == ==
> Suit           1  2  3  4  5  6  7  8  9
> ============== == == == == == == == == ==
> Numbers 萬子   �  �  �  �  �  �  �  �  �
> Coins 筒子     �  �  �  �  �  �  �  �  �
> Bamboos 索子   �  �  �  �  �  �  �  �  �
> ============== == == == == == == == == ==
>
> $
> =====
>
> So according to Docutils, the table has “text in column margin”, but as
> far as I can tell looking at the above source text, that's not true.
>
> The text shown above for the source document is valid UTF-8. It consists
> of punctuation, English-language alphanumerics, Chinese hanzi, and
> Mahjong tile characters. Possibly some of these are confusing to the
> Docutils processor.
>
No comment on whether a bug or not, but, as it's arrived here, the
Chinese characters in the heading (before reference card) and those in
the 'Suit' column display correctly but in the number columns all I see
is a ? on a black background. Is this what I should see or is there
something awry (in UTF-8 terms) with the numeric column contents?

> Is this a bug in Docutils?
>

-- 
Chris Green
From: Ben F. <ben...@be...> - 2009-12-29 21:25:52
Chris G <cl...@is...> writes:

> No comment on whether a bug or not, but, as it's arrived here, the
> Chinese characters in the heading (before reference card) and those in
> the 'Suit' column display correctly but in the number columns all I see
> is a ? on a black background. Is this what I should see or is there
> something awry (in UTF-8 terms) with the numeric column contents?

That's what you will see if your font doesn't have a glyph for
representing a character. Since not many fonts contain glyphs for
*every* Unicode character, that's expected, yes :-)

-- 
 \     “There's a fine line between fishing and standing on the shore |
  `\                           looking like an idiot.” —Steven Wright |
_o__)                                                                  |
Ben Finney
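
A missing glyph and damaged data can look alike on screen, so it may help to inspect the characters themselves rather than their rendering. The sketch below is one way to do that in Python 3. The helper names `describe` and `is_valid_utf8` are invented for this example, and the sample string assumes the tiles come from the Unicode Mahjong Tiles block (U+1F007 is used as a stand-in); the exact code points in the original message are not stated in the thread. A character that reports a proper name but renders as a box or question mark points to the font, whereas U+FFFD REPLACEMENT CHARACTER means the original character was already lost somewhere in transit.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) for each non-space character."""
    return [
        ("U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in text
        if not ch.isspace()
    ]


def is_valid_utf8(data):
    """Report whether a byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Assumed sample: a hanzi, a Mahjong tile, and a replacement character.
sample = "萬 \U0001F007 \uFFFD"
for code, name in describe(sample):
    print(code, name)

print(is_valid_utf8(sample.encode("utf-8")))  # True
print(is_valid_utf8(b"\xf0\x9f\x80"))         # False: truncated 4-byte sequence
```

Note that a string containing U+FFFD is still valid UTF-8, so the validity check alone cannot tell whether characters were replaced along the way; the per-character listing is what shows it.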