missing content-type meta header
Brought to you by:
edloper
Generated HTML pages lack a Content-Type meta header. While the text/html part is not strictly required, this header is also used to specify the charset of a page, and that is much more important, for instance if your documentation uses non-Latin letters.
Ideally, the header should look like this (in the <head> section):
<meta http-equiv="Content-Type" content="text/html; charset=TAKEN-FROM-SOURCE-CODE" />
I'm not really sure how to retrieve the source code encoding, but it should at least be possible by parsing the file. E.g. my Python files start with this line:
# -*- coding: utf-8 -*-
See the Python 2.3 documentation; AFAIK encoding declaration support was added in that version.
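For illustration, here is a minimal sketch of how the declared encoding could be read and turned into the meta header above. It assumes a modern Python 3 standard library (`tokenize.detect_encoding` does not exist in Python 2.3) and is not how Epydoc itself works; the helper names `source_encoding` and `content_type_meta` are made up for this example.

```python
import io
import tokenize


def source_encoding(source_bytes):
    """Return the encoding declared in the first two lines of Python source.

    Hypothetical helper: relies on tokenize.detect_encoding, which falls
    back to 'utf-8' in Python 3 when no coding line or BOM is present.
    """
    encoding, _lines_read = tokenize.detect_encoding(
        io.BytesIO(source_bytes).readline
    )
    return encoding


def content_type_meta(encoding):
    """Build the Content-Type meta header for the given charset (hypothetical helper)."""
    return (
        '<meta http-equiv="Content-Type" '
        'content="text/html; charset=%s" />' % encoding
    )


sample = b"# -*- coding: utf-8 -*-\nprint('hello')\n"
print(content_type_meta(source_encoding(sample)))
```

Note that this only covers pages built from a single source file; it does not address pages that combine docstrings from several modules.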
Logged In: YES
user_id=195958
Originator: NO
Epydoc attempts to ensure that *all* output it generates is 7-bit ASCII. If your Python source files use non-ASCII characters, then they'll be converted to unicode when the module is parsed/introspected; and then those unicode characters will be rendered as HTML entities when the HTML is generated.
This seems to me to be the only sensible approach, given that it's possible to get docstrings from different modules, with potentially different encodings, on a single HTML page. (e.g., think of the module hierarchy page, which includes a summary description of each module.)
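The general idea of rendering non-ASCII characters as HTML entities can be sketched as follows. This is only an illustration using Python 3's built-in `xmlcharrefreplace` error handler, not Epydoc's actual code; the function name `to_ascii_html` is hypothetical.

```python
def to_ascii_html(text):
    """Return text as pure ASCII, with non-ASCII characters replaced by
    numeric character references (hypothetical helper).

    Note: this does not escape '<', '>' or '&'; it only handles the
    non-ASCII characters.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_ascii_html("caf\u00e9"))  # caf&#233;
```

Because the result contains only ASCII, it displays the same way regardless of which charset the browser assumes for the page.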
That said, I don't see how adding a meta-header that specifies charset as ASCII would hurt, so perhaps I should add it.
If you find that epydoc is not doing what it's supposed to -- i.e., if your non-ascii unicode characters are not getting rendered correctly -- then please send me an example file that generates the problem, and I'll look at what might be causing it.
Logged In: YES
user_id=1203127
Originator: YES
It seems that I didn't check after upgrading to Epydoc 3.0.0alpha3 from an old version. It still wrote no charset, so I assumed the problem with non-ASCII characters was still there. (My script adds a charset header at the post-processing stage, so I could only detect that the bug was fixed in Epydoc by looking into the HTML source.) So, I am closing this bug as already fixed.
However, I'd suggest adding a command-line option that would set the charset to the option's value for the whole Epydoc run. If you can convert arbitrary characters (presumably in different encodings) to HTML entities, you should be able to convert them to e.g. UTF-8 as well. While this is a minor feature, it would reduce the size of generated HTML pages that contain many non-ASCII characters.
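To give a rough idea of the size difference, here is a small sketch comparing the two encodings of the same string. It uses Python 3 and a made-up helper name, `encoded_sizes`; the actual savings depend on the characters involved.

```python
def encoded_sizes(text):
    """Return (bytes as ASCII with numeric entities, bytes as UTF-8).

    Hypothetical helper for comparing output sizes only.
    """
    as_entities = text.encode("ascii", "xmlcharrefreplace")
    as_utf8 = text.encode("utf-8")
    return len(as_entities), len(as_utf8)


# Six Cyrillic letters ("Привет"): each becomes a 7-byte entity such as
# &#1055; but only 2 bytes in UTF-8.
print(encoded_sizes("\u041f\u0440\u0438\u0432\u0435\u0442"))  # (42, 12)
```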