|
From: Greg W. <gw...@me...> - 2002-10-08 16:27:19
|
I'm having problems validating HTML output by Docutils using nsgmls, my
HTML/XHTML validator of choice. (It's from James Clark's SP package --
installed on my Debian "unstable" box as sp_1.3.4-1.2.1-28.)

I'm sure you're all familiar with the first two lines of Docutils HTML
output:

    $ head -2 upload.html
    <?xml version="1.0" encoding="us-ascii"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

If I ask nsgmls to parse this (I just use it for the errors and warnings
it prints), it barfs almost immediately:

    $ nsgmls upload.html
    nsgmls:upload.html:2:63:E: name start character invalid: only delimiter ">", delimiter "[", system identifier and parameter separators are allowed
    nsgmls:upload.html:2:63:E: cannot continue because of previous errors
    ?xml version="1.0" encoding="us-ascii"?

Line 2 column 63 is the word "SYSTEM". I don't know much about the hairy
minutiae of (X)HTML DTD lines, so I can't tell offhand if this is valid --
I don't recall seeing other DTD lines with it, but that's not saying much.

So what's up here -- is Docutils emitting bad HTML here? or is nsgmls
barfing on valid HTML?

Greg
--
Greg Ward - software developer gw...@me...
MEMS Exchange http://www.mems-exchange.org
|
|
From: David G. <go...@us...> - 2002-10-09 01:09:20
|
Greg Ward wrote:
> I'm having problems validating HTML output by Docutils using nsgmls,
> my HTML/XHTML validator of choice. (It's from the James Clark's SP
> package -- installed on my Debian "unstable" box as
> sp_1.3.4-1.2.1-28.)

Brings back memories. I used SP/nsgmls and their predecessor sgmls
intensively for 3 years, back when I lived in Japan.

> I'm sure you're all familiar with the first two lines of Docutils
> HTML output:
>
>     $ head -2 upload.html
>     <?xml version="1.0" encoding="us-ascii"?>
>     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
>     SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
>
> If I ask nsgmls to parse this (I just use it for the errors and warnings
> it prints), it barfs almost immediately:
...
> So what's up here -- is Docutils emitting bad HTML here? or is
> nsgmls barfing on valid HTML?

That was just a thinko. I thought the word "SYSTEM" was needed there,
but it isn't. I had the same bug with the XML writer
(docutils_xml.py), fixed recently. I've fixed the DOCTYPE header for
HTML; please try it again with the latest snapshot.

--
David Goodger <go...@us...>

Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/
|
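For reference: an XML document type declaration names its external subset
either as SYSTEM "system-id" or as PUBLIC "public-id" "system-id", so the
SYSTEM keyword never follows a public identifier. A minimal Python sketch of
the correction discussed in this exchange follows. The helper name
fix_doctype is hypothetical, and this is not the change actually made to the
Docutils HTML writer; it only illustrates the before/after form of the
declaration.

```python
import re

# Matches the malformed form  PUBLIC "<public id>" SYSTEM "<system id>"
# and captures everything up to (but not including) the stray keyword.
STRAY_SYSTEM = re.compile(
    r'(<!DOCTYPE\s+\S+\s+PUBLIC\s+"[^"]*"\s+)SYSTEM\s+(?="[^"]*"\s*>)'
)


def fix_doctype(text):
    """Drop a stray SYSTEM keyword that follows a PUBLIC identifier.

    Hypothetical helper for illustration; declarations that are already
    well-formed are returned unchanged.
    """
    return STRAY_SYSTEM.sub(r"\1", text)


bad = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    'SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
)
good = fix_doctype(bad)
print(good)
```

Running the function a second time on its own output should leave the
declaration untouched, since the pattern only matches when the keyword is
present.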
|
From: Greg W. <gw...@me...> - 2002-10-21 16:40:59
|
On 08 October 2002, David Goodger said:
> That was just a thinko. I thought the word "SYSTEM" was needed there,
> but it isn't. I had the same bug with the XML writer
> (docutils_xml.py), fixed recently. I've fixed the DOCTYPE header for
> HTML; please try it again with the latest snapshot.

OK, I finally got around to cvs up'ing my docutils installation.
nsgmls happily validates all of the Quixote documentation
(http://www.mems-exchange.org/software/quixote/doc/), with one
exception. In particular, this reStructuredText snippet:

"""
...
The xmlrpclib module, part of the Python 2.2 standard library and
available separately from http://www.pythonware.com/products/xmlrpc/,
converts between Python's standard data types and the XML-RPC data
types.

============== =====================
XML-RPC Type   Python Type or Class
-------------- ---------------------
<int>          int
[...]
============== =====================
"""

generates this HTML:

"""
The xmlrpclib module, part of the Python 2.2 standard library and
available separately from <a class="reference" href="http://www.pythonware.com/products/xmlrpc/">http://www.pythonware.com/products/xmlrpc/</a>,
converts between Python's standard data types and the XML-RPC data
types.</p>
<table class="table" frame="border" rules="all">
<colgroup>
<col colwidth="40%" />
<col colwidth="60%" />
</colgroup>
<tbody valign="top">
<tr><td>XML-RPC Type</td>
<td>Python Type or Class</td>
</tr>
<tr><td>&lt;int&gt;</td>
<td>int</td>
</tr>
<tr><td>&lt;double&gt;</td>
[...]
</tbody>
</table>
"""

If I run nsgmls on the whole HTML file, I get:

    $ nsgmls -wxml -c /www/misc/sgml/xhtml.soc web-services.html > /dev/null
    nsgmls:web-services.html:26:14:E: there is no attribute "colwidth"

Line 26 is the first "<col>" tag inside "<colgroup>".

Oh yeah, the /www/misc/sgml/xhtml.soc file is just something I need to
pass to nsgmls to stop it from spewing bazillions of meaningless (to me)
errors. I'm no SGML/XML expert, so that's the limit of my understanding.

Greg
--
Greg Ward - software developer gw...@me...
MEMS Exchange http://www.mems-exchange.org
|
|
From: David G. <go...@us...> - 2002-10-22 01:05:31
|
Greg Ward wrote:
> OK, I finally got around to cvs up'ing my docutils installation.
> nsgmls happily validates all of the Quixote documentation
> (http://www.mems-exchange.org/software/quixote/doc/), with one
> exception.

The docs look good. Please note that if you regenerate the docs with
the latest Docutils (which I recommend you do, to take advantage of
the improvements to the HTML produced), you should also replace the
stylesheet. It looks like you've made some modifications to the stock
stylesheet, which is fine. To keep your modifications *and* get the
benefit of the recent improvements, without having to edit the
stylesheet each time, I recommend extracting your modifications into a
separate .css file and using the "@import" statement to cascade the
stylesheets. See http://docutils.sf.net/docs/tools.html#stylesheets
for details.

BTW, how about a plug? If you run html.py with the --generator option
(or add "generator: 1" to your docutils.conf file), you'll get a
discreet "Generated by Docutils" credit at the bottom of the file.
I'll say no more. :-)

> If I run nsgmls on the whole HTML file, I get:
>
>     $ nsgmls -wxml -c /www/misc/sgml/xhtml.soc web-services.html > /dev/null
>     nsgmls:web-services.html:26:14:E: there is no attribute "colwidth"
>
> Line 26 is the first "<col>" tag inside "<colgroup>".

A typo, fixed now. The HTML attribute is "width", not "colwidth". I
must have just copied the attribute from the Docutils internal table
model, where the attribute *is* "colwidth", without realizing.
(Docutils uses a slightly customized "OASIS XML Exchange Table Model",
based on CALS tables; DTD in spec/soextblx.dtd.)

That bugfix actually solves a minor problem with HTML table rendering
(the width suggestions weren't being heeded), which I had forgotten
about. Thanks!

> Oh yeah, the /www/misc/sgml/xhtml.soc file is just something I need
> to pass to nsgmls to stop it from spewing bazillions of meaningless
> (to me) errors. I'm no SGML/XML expert, so that's the limit of my
> understanding.

The .soc extension is used on "SGML Open Catalog" files. They provide
local mappings from PUBLIC IDs (which start with "-//W3C//DTD..." in
xhtml.soc) to SYSTEM IDs (local filesystem paths). I thought XML
didn't use them, but it could be a legacy thing since nsgmls was an
SGML parser before XML existed.

--
David Goodger <go...@us...>

Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/
|
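The colwidth fix described in this message comes down to writing the HTML
attribute "width" on each <col> instead of the internal model's "colwidth".
A rough Python sketch of that translation follows. It is not the actual
Docutils writer code: col_elements is a hypothetical helper, and it assumes
the internal colwidth values are per-column character counts (14 and 21
would match the table borders in the reported snippet) that get converted
to percentages.

```python
def col_elements(colwidths):
    """Return one HTML <col> element per column.

    `colwidths` holds relative column widths (assumed here to be
    character counts). Each is emitted as a percentage in the HTML
    "width" attribute, not the internal "colwidth" name. Floor division
    is used, so the percentages may sum to slightly less than 100.
    """
    total = sum(colwidths)
    return ['<col width="%d%%" />' % (100 * w // total) for w in colwidths]


for element in col_elements([14, 21]):
    print(element)
```

With the 14- and 21-character columns this yields 40% and 60%, the same
proportions that appeared in the invalid colwidth attributes.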
|
From: Greg W. <gw...@me...> - 2002-10-22 15:10:22
|
On 21 October 2002, David Goodger said:
> The docs look good. Please note that if you regenerate the docs with
> the latest Docutils (which I recommend you do, to take advantage of
> the improvements to the HTML produced)

OK, done.

> you should also replace the
> stylesheet. It looks like you've made some modifications to the stock
> stylesheet, which is fine. To keep your modifications *and* get the
> benefit of the recent improvements, without having to edit the
> stylesheet each time, I recommend extracting your modifications into a
> separate .css file and using the "@import" statement to cascade the
> stylesheets. See http://docutils.sf.net/docs/tools.html#stylesheets
> for details.

OK, tried that. Problem: the main modification I made was to
completely *remove* your styles for the 'a' and 'tt' tags. (I think
link colouring should be up to the browser, and I don't like a
background on inline literals.)

So how do I override your stylesheet with removal information? I tried

    tt { }

but that didn't work. Didn't really expect it to.

Greg
--
Greg Ward - software developer gw...@me...
MEMS Exchange http://www.mems-exchange.org
|
|
From: David G. <go...@us...> - 2002-10-23 01:31:26
|
[David]
>> The docs look good. Please note that if you regenerate the docs with
>> the latest Docutils (which I recommend you do, to take advantage of
>> the improvements to the HTML produced)

[Greg]
> OK, done.

The sources on the web site still say "Docutils 0.2.4". The current
version is 0.2.7. Not pushed out yet?

>> you should also replace the stylesheet. ... I recommend extracting
>> your modifications into a separate .css file and using the
>> "@import" statement to cascade the stylesheets. See
>> http://docutils.sf.net/docs/tools.html#stylesheets for details.
>
> OK, tried that. Problem: the main modification I made was to
> completely *remove* your styles for the 'a' and 'tt' tags. (I think
> link colouring should be up to the browser, and I don't like a
> background on inline literals.)
>
> So how do I override your stylesheet with removal information?

You can set the <tt> style back to its initial value::

    tt { background-color: transparent }

As for the <a> tags, these are the styles specified::

    a.target { color: blue }

    a.toc-backref { text-decoration: none ; color: black }

The first is easily undone::

    a.target { color: inherit }

In fact, I think I'll remove the "a.target" style from the project's
default.css. It was useful for diagnostics, but implies meaning where
there really is none. And it's distracting. ... Gone now.

I don't know of any way to undo the second set of styles
("a.toc-backref"). But they're only applied to back-links from section
headers to a table of contents. If you have no table of contents, or
specify "--no-toc-backlinks" (or "toc_backlinks: none" in the config
file), that style will have no effect. These styles remove the typical
hyperlink formatting (color + underline), to make the back-linked
section headers look like regular section headers.

An approximation to undoing the style would be::

    a.toc-backref { text-decoration: underline ; color: blue }

However, the browser itself or user settings may specify a different
initial color and/or decoration, and the color should change once the
hyperlink is visited. These can also be specified (using the ":link"
and ":visited" pseudo-classes), but that just makes the whole thing
even more complicated.

Is there any way to *disable* styles that don't inherit? Any way to
say "use or restore the *initial* value for this style, ignoring any
later explicit styles"? I can't find any.

--
David Goodger <go...@us...>

Open-source projects:
- Python Docutils: http://docutils.sourceforge.net/
  (includes reStructuredText: http://docutils.sf.net/rst.html)
- The Go Tools Project: http://gotools.sourceforge.net/
|
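The cascading approach discussed in this thread can be sketched as a small
Python script that builds a site-specific stylesheet: it imports the stock
stylesheet first (CSS requires @import rules to come before any other
rules) and then resets the two properties mentioned above. The helper name
custom_stylesheet and the file name default.css as the import target are
assumptions for illustration, and the a.target reset only matters with an
older default.css that still carries that rule.

```python
def custom_stylesheet(base="default.css"):
    """Return CSS text that cascades local overrides over `base`.

    Hypothetical helper: `base` is assumed to be the stock stylesheet
    sitting next to the generated file.
    """
    return "\n".join([
        # @import must precede every other rule in the stylesheet.
        '@import url("%s");' % base,
        "",
        # Undo the background on inline literals.
        "tt { background-color: transparent }",
        # Undo the diagnostic colour on link targets.
        "a.target { color: inherit }",
        "",
    ])


print(custom_stylesheet())
```

Saving the output under a separate name and pointing the generated HTML at
that file keeps the local changes out of the stock stylesheet, so the stock
file can be replaced on upgrade without re-editing.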
|
From: Greg W. <gw...@me...> - 2002-10-23 13:43:50
|
On 22 October 2002, David Goodger said:
> The sources on the web site still say "Docutils 0.2.4". The current
> version is 0.2.7. Not pushed out yet?

Correct -- I'm just playing on my development web server.

> You can set the <tt> style back to its initial value::
>
>     tt { background-color: transparent }

Yup, that works.

> In fact, I think I'll remove the "a.target" style from the project's
> default.css. It was useful for diagnostics, but implies meaning where
> there really is none. And it's distracting. ... Gone now.

OK, I'll cvs up and stop worrying about the "a" styles then.

Thanks!

Greg
--
Greg Ward - software developer gw...@me...
MEMS Exchange http://www.mems-exchange.org
|