#123 Unicode escaped characters (in dot graphs)

Milestone: v3.0
Status: closed-fixed
Owner: Edward Loper
Priority: 5
Updated: 2006-08-29
Created: 2006-08-25
Creator: Anonymous
Private: No

Hi,

I'm using version 3.0 alpha 3.

I've experienced epydoc crashes when generating docs
for my project.

My source files are all ASCII-encoded, but since I write
comments in French, I use some escaped Unicode
characters like '\xe9' (é) or '\xe0' (à) in my docstrings.

The epydoc command raises a unicode exception
(something like 'ascii' codec can't decode character
'0xc3'...) at line 761 in module epydoc.docwriter.html:

out('<center>\n%s</center>\n' % self.render_graph(graph))

I've been able to correct this error by making the
following changes:

- In module epydoc.docwriter.html, line 619:
      f = codecs.open(path, 'w', 'ascii', errors='xmlcharrefreplace')
  becomes:
      f = codecs.open(path, 'w', 'utf-8', errors='xmlcharrefreplace')

- In module epydoc.docwriter.dotgraph, line 182:
      return s
  becomes:
      return s.decode('utf-8')
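
The failure and the second change above can be illustrated with a minimal sketch. It is written for Python 3 (the report concerns a Python 2 code base, where the ASCII decode happened implicitly when a byte string was formatted into a unicode string). The sample value and the helper name `decode_tool_output` are assumptions made for illustration; they are not taken from epydoc.

```python
# Bytes as a UTF-8-speaking tool such as dot might return them
# (illustrative value, not real dot output).
raw = "caf\u00e9".encode("utf-8")  # b'caf\xc3\xa9'


def decode_tool_output(data: bytes) -> str:
    """Hypothetical helper: decode tool output explicitly as UTF-8."""
    return data.decode("utf-8")


# Decoding the same bytes as ASCII fails on the first non-ASCII byte (0xc3).
try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Decoding as UTF-8 yields text that can be formatted into the page safely.
text = decode_tool_output(raw)
page = "<center>\n%s</center>\n" % text
```

Here `ascii_error` holds the `UnicodeDecodeError`, while `page` contains the accented character as ordinary text.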

This lets me generate the HTML pages without an exception,
but I still have to manually change the encoding in my
browser.

To correct this, I've also replaced every line like
    <?xml version="1.0" encoding="ascii_or_iso-8859-1"?>
with
    <?xml version="1.0" encoding="utf-8"?>
in modules:
- epydoc.html
- epydoc.docwriter.html
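
A possible explanation for why the declaration had to change, sketched in Python 3: with the 'ascii' codec, `xmlcharrefreplace` turns accented characters into numeric character references, so the file stays pure ASCII. With 'utf-8' nothing needs replacing, raw multi-byte sequences are written, and the declared encoding has to match them. The helper `xml_declaration` is hypothetical and only mirrors the lines quoted above.

```python
text = "\u00e9t\u00e9"  # "été"

# ASCII cannot represent é, so each one becomes a character reference.
ascii_bytes = text.encode("ascii", errors="xmlcharrefreplace")

# UTF-8 can represent é directly, so the error handler is never invoked.
utf8_bytes = text.encode("utf-8", errors="xmlcharrefreplace")

# Reading the UTF-8 bytes under the wrong declared encoding garbles the text.
misread = utf8_bytes.decode("iso-8859-1")


def xml_declaration(encoding: str) -> str:
    """Hypothetical helper: build a declaration matching the file encoding."""
    return '<?xml version="1.0" encoding="%s"?>' % encoding


header = xml_declaration("utf-8")
```

In this sketch `ascii_bytes` is `b'&#233;t&#233;'`, which any encoding declaration renders correctly, whereas `utf8_bytes` only displays properly when the page declares UTF-8.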

Now it works great. UML generation is a wonderful feature.

I hope this helps. Many thanks for your great work.

Olivier Thiery
olivier.thiery at ineo-orrma.fr

Discussion

  • Edward Loper
    2006-08-29

    Would you mind attaching a small file that causes the bug
    you describe? Thanks.

     
  • Edward Loper
    2006-08-29

    Thanks for the bug report.

    Fixed in subversion revision 1328. (The cmapx returned by
    dot needed to be decoded (using utf-8, since that's what
    dot uses for all i/o) before being used.)

    I also fixed a (somewhat) related bug in subversion
    revision 1329.

     
  • Edward Loper
    2006-08-29

    • status: open --> closed
     
  • Edward Loper
    2006-08-29

    • summary: Unicode escaped characters --> Unicode escaped characters (in dot graphs)
    • status: closed --> closed-fixed
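
The fix described in the discussion (decoding dot's cmapx output as UTF-8 before using it) amounts to handling the encoding at the process boundary. The following Python 3 sketch shows that idea under stated assumptions: `run_filter` and `ECHO` are hypothetical names, and this is not the code from revision 1328. A small Python echo process stands in for dot so the example runs without Graphviz installed.

```python
import subprocess
import sys


def run_filter(cmd, source: str) -> str:
    """Send text to a subprocess as UTF-8 and decode its stdout as UTF-8."""
    result = subprocess.run(
        cmd,
        input=source.encode("utf-8"),  # encode explicitly on the way in
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("utf-8")  # decode explicitly on the way out


# Stand-in for dot: copies stdin to stdout byte for byte.
ECHO = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]

echoed = run_filter(ECHO, 'label="\u00e9"')
```

With Graphviz available, the same helper could presumably be called as `run_filter(["dot", "-Tcmapx"], graph_source)`, where `graph_source` is the dot description of the graph.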