From: David A. <da...@bo...> - 2002-12-06 23:58:18
|
I just ran HTMLTidy over the output of

  python html.py test.txt test.html

in the tools directory. It had lots of complaints, many of which look
serious, though it only flagged them as warnings. In particular, I
notice lots of characters which appear to be invalid (possibly
nuls). I'm not an HTML expert, which is why I use Tidy. Are these
worth doing something about?

c:/src/docutils/tools/test.html:16:1: Warning: <meta> element not empty or not closed
c:/src/docutils/tools/test.html:17:1: Warning: <meta> element not empty or not closed
c:/src/docutils/tools/test.html:23:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:85:24: Warning: <a> Anchor "table-of-contents" already defined
c:/src/docutils/tools/test.html:130:5: Warning: <a> Anchor "structural-elements" already defined
c:/src/docutils/tools/test.html:132:5: Warning: <a> Anchor "section-title" already defined
c:/src/docutils/tools/test.html:136:5: Warning: <a> Anchor "transitions" already defined
c:/src/docutils/tools/test.html:143:5: Warning: <a> Anchor "body-elements" already defined
c:/src/docutils/tools/test.html:145:5: Warning: <a> Anchor "paragraphs" already defined
c:/src/docutils/tools/test.html:148:5: Warning: <a> Anchor "inline-markup" already defined
c:/src/docutils/tools/test.html:158:66: Warning: <span> Anchor "id20" already defined
c:/src/docutils/tools/test.html:172:5: Warning: <a> Anchor "bullet-lists" already defined
c:/src/docutils/tools/test.html:195:5: Warning: <a> Anchor "enumerated-lists" already defined
c:/src/docutils/tools/test.html:228:5: Warning: <a> Anchor "definition-lists" already defined
c:/src/docutils/tools/test.html:241:5: Warning: <a> Anchor "field-lists" already defined
c:/src/docutils/tools/test.html:242:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:259:5: Warning: <a> Anchor "option-lists" already defined
c:/src/docutils/tools/test.html:261:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:305:5: Warning: <a> Anchor "literal-blocks" already defined
c:/src/docutils/tools/test.html:316:5: Warning: <a> Anchor "block-quotes" already defined
c:/src/docutils/tools/test.html:324:5: Warning: <a> Anchor "doctest-blocks" already defined
c:/src/docutils/tools/test.html:333:5: Warning: <a> Anchor "tables" already defined
c:/src/docutils/tools/test.html:335:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:378:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:414:5: Warning: <a> Anchor "footnotes" already defined
c:/src/docutils/tools/test.html:415:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:418:23: Warning: <a> Anchor "id6" already defined
c:/src/docutils/tools/test.html:424:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:427:23: Warning: <a> Anchor "label" already defined
c:/src/docutils/tools/test.html:434:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:437:23: Warning: <a> Anchor "id9" already defined
c:/src/docutils/tools/test.html:441:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:444:23: Warning: <a> Anchor "id10" already defined
c:/src/docutils/tools/test.html:445:113: Warning: replacing invalid character code 128
c:/src/docutils/tools/test.html:448:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:451:23: Warning: <a> Anchor "id12" already defined
c:/src/docutils/tools/test.html:451:72: Warning: replacing invalid character code 128
c:/src/docutils/tools/test.html:454:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:457:23: Warning: <a> Anchor "id13" already defined
c:/src/docutils/tools/test.html:458:51: Warning: <span> Anchor "id62" already defined
c:/src/docutils/tools/test.html:463:5: Warning: <a> Anchor "citations" already defined
c:/src/docutils/tools/test.html:464:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:468:23: Warning: <a> Anchor "cit2002" already defined
c:/src/docutils/tools/test.html:472:154: Warning: <span> Anchor "id64" already defined
c:/src/docutils/tools/test.html:476:5: Warning: <a> Anchor "targets" already defined
c:/src/docutils/tools/test.html:486:41: Warning: <span> Anchor "id66" already defined
c:/src/docutils/tools/test.html:489:5: Warning: <a> Anchor "duplicate-target-names" already defined
c:/src/docutils/tools/test.html:495:5: Warning: <a> Anchor "id18" already defined
c:/src/docutils/tools/test.html:498:35: Warning: <span> Anchor "id68" already defined
c:/src/docutils/tools/test.html:502:5: Warning: <a> Anchor "directives" already defined
c:/src/docutils/tools/test.html:516:5: Warning: <a> Anchor "document-parts" already defined
c:/src/docutils/tools/test.html:522:5: Warning: <a> Anchor "images" already defined
c:/src/docutils/tools/test.html:530:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:552:5: Warning: <a> Anchor "admonitions" already defined
c:/src/docutils/tools/test.html:589:5: Warning: <a> Anchor "target-footnotes" already defined
c:/src/docutils/tools/test.html:590:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:593:23: Warning: <a> Anchor "id21" already defined
c:/src/docutils/tools/test.html:598:5: Warning: <a> Anchor "line-blocks" already defined
c:/src/docutils/tools/test.html:617:5: Warning: <a> Anchor "replacement-text" already defined
c:/src/docutils/tools/test.html:622:5: Warning: <a> Anchor "substitution-definitions" already defined
c:/src/docutils/tools/test.html:627:5: Warning: <a> Anchor "comments" already defined
c:/src/docutils/tools/test.html:638:5: Warning: <a> Anchor "error-handling" already defined
c:/src/docutils/tools/test.html:647:50: Warning: <a> Anchor "id19" already defined
c:/src/docutils/tools/test.html:650:50: Warning: <a> Anchor "id61" already defined
c:/src/docutils/tools/test.html:653:50: Warning: <a> Anchor "id63" already defined
c:/src/docutils/tools/test.html:656:50: Warning: <a> Anchor "id65" already defined
c:/src/docutils/tools/test.html:659:50: Warning: <a> Anchor "id67" already defined

c:/src/docutils/tools/test.html: Doctype given is "-//W3C//DTD XHTML 1.0 Transitional//EN"
c:/src/docutils/tools/test.html: Document content looks like XHTML 1.0 Strict
67 warnings, 0 errors were found!

Character codes 128 to 159 (U+0080 to U+009F) are not allowed in HTML;
even if they were, they would likely be unprintable control characters.
Tidy assumed you wanted to refer to a character with the same byte value
in the Windows-1252 encoding and replaced that reference with the Unicode
equivalent.

The table summary attribute should be used to describe
the table structure. It is very helpful for people using
non-visual browsers. The scope and headers attributes for
table cells are useful for specifying which headers apply
to each table cell, enabling non-visual browsers to provide
a meaningful context for each cell.

For further advice on how to make your pages accessible
see "http://www.w3.org/WAI/GL". You may also want to try
"http://www.cast.org/bobby/" which is a free Web-based
service for checking URLs for accessibility.

To learn more about HTML Tidy see http://tidy.sourceforge.net

--
David Abrahams
da...@bo... * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution
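A report this long is easier to judge once the warnings are grouped by kind. The following is a minimal sketch, assuming the report has one warning per line in the `path:line:col: Warning: message` shape shown above; the function names and the category labels are illustrative and are not part of Tidy or Docutils.

```python
import re
from collections import Counter

# Matches report lines such as
#   c:/src/docutils/tools/test.html:16:1: Warning: <meta> element not empty or not closed
WARNING_RE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):(?P<col>\d+): Warning: (?P<msg>.*)$"
)


def classify(msg):
    """Collapse a warning message into a coarse category (labels are illustrative)."""
    if "already defined" in msg:
        return "duplicate anchor"
    if 'lacks "summary"' in msg:
        return "table without summary"
    if "invalid character code" in msg:
        return "invalid character code"
    return msg  # anything unrecognised keeps its full message


def tally_warnings(report):
    """Count warnings per category in a Tidy-style text report."""
    counts = Counter()
    for raw in report.splitlines():
        match = WARNING_RE.match(raw.strip())
        if match:
            counts[classify(match.group("msg"))] += 1
    return counts


sample = """\
c:/src/docutils/tools/test.html:16:1: Warning: <meta> element not empty or not closed
c:/src/docutils/tools/test.html:23:1: Warning: <table> lacks "summary" attribute
c:/src/docutils/tools/test.html:85:24: Warning: <a> Anchor "table-of-contents" already defined
c:/src/docutils/tools/test.html:130:5: Warning: <a> Anchor "structural-elements" already defined
c:/src/docutils/tools/test.html:445:113: Warning: replacing invalid character code 128
"""

for category, count in tally_warnings(sample).most_common():
    print(f"{count:3d}  {category}")
```

Run against the full report, the per-category counts should add up to the total that Tidy prints on its summary line.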
From: David G. <go...@py...> - 2002-12-07 03:32:01
|
David Abrahams wrote:
> I just ran HTMLTidy over the output of
>
>   python html.py test.txt test.html
>
> in the tools directory. It had lots of complaints, many of which
> look serious, though it only flagged them as warnings.
...
| 67 warnings, 0 errors were found!

Warnings mean "I'm not sure, but I think something may be wrong here"
and it's left to the user to judge. Many of these cases are not
really problems, just HTMLTidy being overly critical.

> In particular, I notice lots of characters which appear to be
> invalid (possibly nuls).
...
| Character codes 128 to 159 (U+0080 to U+009F) are not allowed in HTML

That's because test.html is encoded in UTF-8. Looks like HTMLTidy
doesn't understand the ``<?xml version="1.0" encoding="utf-8" ?>``
processing instruction at the beginning of the file. Try the "-utf8"
option.

| even if they were, they would likely be unprintable control
| characters. Tidy assumed you wanted to refer to a character with the
| same byte value in the Windows-1252 encoding and replaced that
| reference with the Unicode equivalent.

Dangerous assumption.

> I'm not an HTML expert, which is why I use Tidy. Are these
> worth doing something about?

Some are, some aren't.

| : Doctype given is "-//W3C//DTD XHTML 1.0 Transitional//EN"
| : Document content looks like XHTML 1.0 Strict

HTMLTidy doesn't understand the transitional DTD? Seems odd.

| test.html:16:1: Warning: <meta> element not empty or not closed
| :17:1: Warning: <meta> element not empty or not closed

These are real errors, now corrected.

| :23:1: Warning: <table> lacks "summary" attribute
...
| The table summary attribute should be used to describe
| the table structure. It is very helpful for people using
| non-visual browsers. The scope and headers attributes for
| table cells are useful for specifying which headers apply
| to each table cell, enabling non-visual browsers to provide
| a meaningful context for each cell.

The "summary" attribute is not required by the HTML 4 spec, just
recommended. While I sympathize with its aim, I don't know of any way
to automatically generate a summary of a table. Eventually a formal
"table" directive may be written, and it could have a "summary"
option.

| :85:24: Warning: <a> Anchor "table-of-contents" already defined

This seems to be because both the "id" attribute of the container
element and the "name" attribute of "<a>" elements are set to the same
thing (as specified in Appendix C of the XHTML spec,
http://www.w3.org/TR/xhtml1). HTML 4 and XHTML want elements to use
the "id" attribute, but Netscape 4 only works with the "name"
attribute on "<a>" tags. Perhaps the id and name attributes ought to
be on the same element though... HTML is a mishmash; can't win.
Unless there's a problem with a real browser (not just a tool like
tidy), I don't see the need to fix this.

Thanks for taking the time to run HTMLTidy and bring these to our
attention.

--
David Goodger <go...@py...> Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/
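The encoding explanation above can be illustrated with a short sketch. It assumes, purely for illustration, that the offending text contained a bullet character (U+2022); the thread does not say which characters test.html actually held. In UTF-8 that character becomes a three-byte sequence whose middle byte is 128, which a tool reading the file one byte at a time would report as character code 128. The helper name `suspicious_bytes` is hypothetical.

```python
def suspicious_bytes(data: bytes):
    """Return the distinct byte values in the range 128..159 found in data."""
    return sorted({b for b in data if 128 <= b <= 159})


# Hypothetical sample text: a bullet (U+2022) followed by ASCII.
text = "\u2022 item"
encoded = text.encode("utf-8")

print(encoded)                    # the raw UTF-8 bytes
print(suspicious_bytes(encoded))  # byte values a byte-oriented reader would flag

# Reading the same bytes as Windows-1252, which is the guess Tidy describes,
# turns one character into three unrelated ones.
misread = encoded.decode("cp1252")
print(misread)

# Decoding with the encoding the file declares recovers the original text.
assert encoded.decode("utf-8") == text
```

The byte 128 is only "invalid" when each byte is treated as its own character; once the input is decoded as UTF-8 the sequence is an ordinary character, which is why telling the tool about the encoding is the suggested fix.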
From: David A. <da...@bo...> - 2002-12-07 14:06:32
|
David Goodger <go...@py...> writes:

> Warnings mean "I'm not sure, but I think something may be wrong here"
> and it's left to the user to judge.

Of course I knew that.

> Many of these cases are not really problems, just HTMLTidy being
> overly critical.
>
>> In particular, I notice lots of characters which appear to be
>> invalid (possibly nuls).
> ...
> | Character codes 128 to 159 (U+0080 to U+009F) are not allowed in HTML
>
> That's because test.html is encoded in UTF-8.

Oh, how novel! I hadn't noticed it was using utf-8. I normally deal
with handwritten HTML, so I've never seen utf-8 encoding in an HTML
document before.

> Looks like HTMLTidy doesn't understand the ``<?xml version="1.0"
> encoding="utf-8" ?>`` processing instruction at the beginning of the
> file. Try the "-utf8" option.

OK.

>> I'm not an HTML expert, which is why I use Tidy. Are these
>> worth doing something about?
>
> Some are, some aren't.
>
> | : Doctype given is "-//W3C//DTD XHTML 1.0 Transitional//EN"
> | : Document content looks like XHTML 1.0 Strict
>
> HTMLTidy doesn't understand the transitional DTD? Seems odd.

I think it does. I think it's saying that you didn't seem to use any
transitional features... but as I said I'm a non-expert.

> | test.html:16:1: Warning: <meta> element not empty or not closed
> | :17:1: Warning: <meta> element not empty or not closed
>
> These are real errors, now corrected.

As I thought.

> | :23:1: Warning: <table> lacks "summary" attribute
> ...
> | The table summary attribute should be used to describe
> | the table structure. It is very helpful for people using
> | non-visual browsers. The scope and headers attributes for
> | table cells are useful for specifying which headers apply
> | to each table cell, enabling non-visual browsers to provide
> | a meaningful context for each cell.
>
> The "summary" attribute is not required by the HTML 4 spec, just
> recommended.

I know. Tidy's warnings about this get boring.

> | :85:24: Warning: <a> Anchor "table-of-contents" already defined

This one worried me, but I guess it's not much of an issue.

> This seems to be because both the "id" attribute of the container
> element and the "name" attribute of "<a>" elements are set to the same
> thing (as specified in Appendix C of the XHTML spec,
> http://www.w3.org/TR/xhtml1). HTML 4 and XHTML want elements to use
> the "id" attribute, but Netscape 4 only works with the "name"
> attribute on "<a>" tags. Perhaps the id and name attributes ought to
> be on the same element though... HTML is a mishmash; can't win.
> Unless there's a problem with a real browser (not just a tool like
> tidy), I don't see the need to fix this.
>
> Thanks for taking the time to run HTMLTidy and bring these to our
> attention.

No problem.

--
David Abrahams
da...@bo... * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution
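The "already defined" diagnosis discussed above can be checked mechanically. The sketch below uses Python's standard `html.parser` to list every anchor name that is declared more than once, together with the element and attribute that declared it. The sample markup is a guess at the general shape being described (an "id" on a container plus a matching "name" on an inner "<a>"), not a copy of real Docutils output, and the class and function names are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Record every element that declares an anchor via id= or <a name=...>."""

    def __init__(self):
        super().__init__()
        # anchor name -> list of (tag, attribute) pairs that declared it
        self.definitions = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if value is None:
                continue
            if key == "id" or (key == "name" and tag == "a"):
                self.definitions[value].append((tag, key))


def repeated_anchors(html_text):
    """Return only the anchor names that are declared more than once."""
    collector = AnchorCollector()
    collector.feed(html_text)
    collector.close()
    return {
        name: defs
        for name, defs in collector.definitions.items()
        if len(defs) > 1
    }


# Illustrative markup only; the real generated HTML may differ.
sample = """
<div class="section" id="transitions">
<h1><a name="transitions">Transitions</a></h1>
<p id="intro">Some text.</p>
</div>
"""

for name, defs in repeated_anchors(sample).items():
    print(name, defs)
```

If every repeated name shows up as one "id" paired with one "name" on an "<a>", the warnings come from the compatibility pattern described in the quoted reply rather than from two genuinely conflicting targets.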
From: Greg W. <gw...@me...> - 2002-12-09 15:01:49
|
On 06 December 2002, David Abrahams said:
> in the tools directory. It had lots of complaints, many of which look
> serious, though it only flagged them as warnings. In particular, I
> notice lots of characters which appear to be invalid (possibly
> nuls). I'm not an HTML expert, which is why I use Tidy. Are these
> worth doing something about?

I've been using nsgmls for validating HTML -- it understands XML just
fine, which is important because Docutils generates XHTML. I haven't
tried it lately, but the last time I did (i.e., after David G. fixed an
error in the DTD line output), Docutils-generated HTML was just fine.

        Greg
--
Greg Ward - software developer gw...@me...
MEMS Exchange http://www.mems-exchange.org
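Because the output is XHTML, a plain XML parser already catches the unclosed "<meta>" class of error mentioned earlier in the thread. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; it checks well-formedness only, which is a weaker test than the DTD validation nsgmls performs. The two sample documents and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Hypothetical fragments: the first leaves <meta> open, the second closes it.
unclosed = (
    '<html><head><meta name="generator" content="Docutils">'
    "</head><body/></html>"
)
closed = (
    '<html><head><meta name="generator" content="Docutils" />'
    "</head><body/></html>"
)

print(is_well_formed(unclosed))
print(is_well_formed(closed))
```

A well-formedness check will not notice a missing "summary" attribute or a doctype mismatch; those need a validating tool that reads the DTD.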