Brett g Porter wrote:
> It's probably staring me in the face, but what's the correct reST
> way to get characters like HTML character entities?
Just type the characters into your input file, using whatever extended
character set input mechanism your OS has. Save the file using
whatever encoding you want (Latin-1, UTF-8, cp1252, unicode-escape,
etc.), and tell the Docutils tool what that encoding is
(-i/--input-encoding option).
The default output encoding, UTF-8, is well-supported and handles all
Unicode characters.
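To see why the input encoding has to be stated, here is a minimal Python 3 sketch using only the standard library. It does not call Docutils; the sample text and the helper name ``decode_as`` are made up for illustration. The same bytes are read back under three different declared encodings.

```python
# Hypothetical sample text; any non-ASCII characters would do.
text = "Copyright © 2003 Example™"

# Save the text in an 8-bit encoding (cp1252 contains both characters).
data = text.encode("cp1252")


def decode_as(raw: bytes, encoding: str) -> str:
    """Decode raw bytes the way a reader told to expect `encoding` would."""
    return raw.decode(encoding)


# Declaring the encoding that was actually used recovers the text.
recovered = decode_as(data, "cp1252")

# Latin-1 decodes without error, but the trademark byte (0x99)
# silently turns into a control character.
garbled = decode_as(data, "latin-1")

# UTF-8 rejects the bytes outright, because 0xA9 cannot start a character.
try:
    decode_as(data, "utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

Only the first call round-trips cleanly, which is the reason for passing the real encoding via -i/--input-encoding.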
> I can't be the first to want to include "&copy;" or "&trade;" in a
> document.
Unfortunately, there's no 'html-character-entity' codec in the stdlib
[#]_. If you were to use such a codec though, you'd have to encode
every "&" as "&amp;" (unless it was a forgiving codec). Does such a
codec exist?
.. [#] There is an "xmlcharrefreplace" error handler in Python 2.3,
   but it only converts characters to "&#169;" forms, not "&copy;",
   and only when encoding. I don't know of an official codec for
   *decoding* character entities.
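For illustration, here is a rough sketch of both directions. It assumes a modern Python 3 (``html.unescape`` and ``html.entities`` did not exist in this form in Python 2.3), and the function names ``to_numeric_refs`` and ``to_named_entities`` are hypothetical helpers, not a codec or part of Docutils.

```python
import html
from html.entities import codepoint2name


def to_numeric_refs(text: str) -> str:
    """Encode with the xmlcharrefreplace handler: non-ASCII becomes &#NNN;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def to_named_entities(text: str) -> str:
    """Replace markup characters and non-ASCII with named entities if known."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in "&<>" or code > 127:
            name = codepoint2name.get(code)
            # Fall back to a numeric reference when no entity name is defined.
            out.append("&%s;" % name if name else "&#%d;" % code)
        else:
            out.append(ch)
    return "".join(out)


numeric = to_numeric_refs("© ™")
named = to_named_entities("© & ™")
# Decoding named and numeric references back to characters.
decoded = html.unescape("&copy; &amp; &trade; &#169;")
```

Note that the error handler leaves a literal "&" alone, so text encoded this way can only be decoded safely if ampersands are escaped separately, as the named-entity helper does.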
> I know that HTML isn't the only output format supported, so I'm not
> surprised that character entities aren't just passed through, I
> guess.
:)
--
David Goodger <go...@py...> Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
(includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/