From: Steve B. <Ste...@zv...> - 2002-05-07 21:20:38
Larry W. Virden wrote:
> I've got the latest tclxml from the CVS repository, and installed it
> with my Tcl 8.4 environment.
>
> In the examples directory, the README says:
>
> REC-xml-20001006.xml
> The W3C XML spec in XML format. A handy file to run xmlwc over
> to test your build. You should get this output with the command
> tclsh8.3 xmlwc REC-xml-20001006.xml
> :
>
> 2929 14978 116827 REC-xml-20001006.xml
>
> However, I am seeing this:
>
> $ tclsh8.4 ./xmlwc REC-xml-20001006.xml
> 2954 14966 117568 REC-xml-20001006.xml
> $ ls -l *xml
> -rwxr-xr-x 1 lwv26 dept26 201918 Dec 28 2000 REC-xml-20001006.xml
>
> Since the numbers are differing so wildly, I thought I would ask here and
> see if anyone else had actually tried this all out.

I get the same numbers from xmlwc as you do. To verify, I ran the
document through an empty XSL stylesheet, which strips the markup.
Then I removed the XML declaration, resulting in a completely plain
text file. wc produces:

[localhost:~/tclxml-main/examples] steve% wc stripped.txt
2951 17360 121252 stripped.txt

Compared with xmlwc, that is 3 fewer lines, but 2394 more words and
3684 more characters.
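
A rough sketch of the same cross-check, in Python rather than XSL: it pulls out the character data and counts it the way wc does. This is an assumption-laden illustration, not how xmlwc works internally. The names wc_counts and character_data and the sample document are hypothetical. Note also that wc reports bytes by default, so the character figure matches wc only for single-byte text, and that a document relying on DTD-defined entities (as the XML spec may) can need a parser that loads the DTD.

```python
import xml.etree.ElementTree as ET


def wc_counts(text):
    """Return (lines, words, chars): newlines, whitespace-separated words, characters."""
    return text.count("\n"), len(text.split()), len(text)


def character_data(xml_source):
    """Concatenate all text nodes in document order, dropping the markup.

    Roughly what an empty XSL stylesheet yields through the built-in
    template rules. Comments and processing instructions are ignored.
    """
    root = ET.fromstring(xml_source)
    return "".join(root.itertext())


# Hypothetical sample, standing in for REC-xml-20001006.xml.
sample = "<doc>\n<p>Hello <b>XML</b> world</p>\n<p>two words</p>\n</doc>\n"

if __name__ == "__main__":
    lines, words, chars = wc_counts(character_data(sample))
    print(lines, words, chars)
```

Running the two functions over the real file would give a third set of numbers to set beside those from xmlwc and wc.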

I'll look into it a bit further. One thing to do is save
the text being counted by xmlwc and then compare against
the text file generated above.
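
One way that comparison could go, sketched in Python under the assumption that both texts have already been saved and read into strings: a word-level diff shows where the two word counts start to diverge. The function name first_differences and the sample strings are hypothetical.

```python
import difflib


def first_differences(text_a, text_b, limit=5):
    """Return up to `limit` word-level differences between two texts.

    Each entry is (tag, words_from_a, words_from_b), where tag is one of
    'replace', 'delete' or 'insert' as reported by difflib.
    """
    a = text_a.split()
    b = text_b.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((tag, a[i1:i2], b[j1:j2]))
            if len(diffs) == limit:
                break
    return diffs


if __name__ == "__main__":
    # Hypothetical stand-ins for the xmlwc text and stripped.txt.
    for diff in first_differences("a b c d", "a b x c d"):
        print(diff)
```

Splitting on whitespace keeps the diff independent of line breaks, so it isolates the word-count discrepancy from the line-count one.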

BTW, I often use this program when writing papers and articles
that have a word limit, so I have an interest in making sure
that it is accurate.

Cheers,
Steve Ball
--
Steve Ball | XSLT Standard Library | Training & Seminars
Zveno Pty Ltd | Web Tcl Complete | XML XSL Schemas
http://www.zveno.com/ | TclXML TclDOM | Tcl, Web Development
Ste...@zv... +---------------------------+---------------------
Ph. +61 2 6242 4099 | Mobile (0413) 594 462 | Fax +61 2 6242 4099