From: Steve B. <Ste...@zv...> - 2002-05-07 21:20:38
Larry W. Virden wrote:
> I've got the latest tclxml from the CVS repository, and installed it
> with my Tcl 8.4 environment.
>
> In the examples directory, the README says:
>
> REC-xml-20001006.xml
>     The W3C XML spec in XML format. A handy file to run xmlwc over
>     to test your build. You should get this output with the command
>     tclsh8.3 xmlwc REC-xml-20001006.xml
>     :
>
>     2929 14978 116827 REC-xml-20001006.xml
>
> However, I am seeing this:
>
> $ tclsh8.4 ./xmlwc REC-xml-20001006.xml
>     2954   14966  117568 REC-xml-20001006.xml
> $ ls -l *xml
> -rwxr-xr-x 1 lwv26 dept26 201918 Dec 28 2000 REC-xml-20001006.xml
>
> Since the numbers are differing so wildly, I thought I would ask here and
> see if anyone else had actually tried this all out.

I get the same numbers.

To verify, I ran the document through an empty XSL stylesheet. This
strips the markup. Then I removed the XML declaration, resulting in a
completely plain text file. wc produces:

[localhost:~/tclxml-main/examples] steve% wc stripped.txt
    2951   17360  121252 stripped.txt

Fewer lines, but more words and characters.

I'll look into it a bit further. One thing to do is save the text
being counted by xmlwc and then compare against the text file
generated above.

BTW, I often use this program when writing papers and articles that
have a word limit, so I have an interest in making sure that it is
accurate.

Cheers,
Steve Ball

--
Steve Ball            | XSLT Standard Library | Training & Seminars
Zveno Pty Ltd         | Web Tcl Complete      | XML XSL Schemas
http://www.zveno.com/ | TclXML TclDOM         | Tcl, Web Development
Ste...@zv...          +-----------------------+---------------------
Ph. +61 2 6242 4099   | Mobile (0413) 594 462 | Fax +61 2 6242 4099
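
The comparison suggested above (count the character data of the XML document, then compare against the counts for the stripped text file) could be sketched roughly as follows. This is an illustration only, in Python rather than Tcl, and it is not how xmlwc itself works; the helper names `wc_counts`, `xml_text`, and `compare` are made up for this example. It also assumes the document parses with the standard-library parser, which may not hold for files that rely on DTD-defined entities, and it counts characters where plain `wc` counts bytes.

```python
import xml.etree.ElementTree as ET


def wc_counts(text):
    """Return (lines, words, chars): newline count, whitespace-separated
    words, and number of characters (not bytes)."""
    return text.count("\n"), len(text.split()), len(text)


def xml_text(xml_source):
    """Concatenate all character data in the document, dropping markup.
    Hypothetical helper; assumes xml_source is well-formed XML."""
    root = ET.fromstring(xml_source)
    return "".join(root.itertext())


def compare(xml_source, stripped_text):
    """Count the XML character data and the stripped text, and report
    the per-column difference (stripped minus xml)."""
    from_xml = wc_counts(xml_text(xml_source))
    from_txt = wc_counts(stripped_text)
    diff = tuple(t - x for x, t in zip(from_xml, from_txt))
    return {"xml": from_xml, "stripped": from_txt, "diff": diff}


if __name__ == "__main__":
    sample = "<doc><p>one two</p>\n<p>three</p>\n</doc>"
    print(compare(sample, "one two three\n"))
```

A non-zero entry in `diff` shows which column (lines, words, or characters) disagrees, which narrows down whether whitespace handling or word splitting is the likely cause.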