From: Ted N. S. A. GA <te...@ag...> - 2002-12-10 18:30:22
In message <3DF...@pr...>you write:
>
>Hi all,
>
>I checked out a fresh copy of tclxml (version 2.5 released today),
>and found that most (all?) of my changes already were applied
>(by Steve Ball presumably). I've been running my patched tclxml
>daily for a year, with a lot of chopped off xml, and that works fine.
>However, the cvs version does not work as previously noted.
>There are some more changes to the cvs version compared to my version.
>The main difference seems to be in sgmlparser, and after some
>search I found the code:
>        # Patch from bug report #596959, Marshall Rose
>to be the reason.
>Just do:
>    if {0} {
>        # Patch from bug report #596959, Marshall Rose
>        if {[string compare [lindex $sgml 4] ""]} {
>            set sgml [linsert $sgml 0 {} {} {} {} {}]
>        }
>    }
>
>I don't know the reason for this code. This must be sorted out
>by the author. This would never have happened if there were a test
>case with -final 0 and chopped off xml. So, please someone,
>add this so we won't see bugs like this again.
>
>There remains a question of how to actually use the -final option.
>I use it always when there is a risk of incomplete xml.
>Is there any reason for using -final 1? What's the advantage?
>
>In case there are remaining problems you can always extract
>my patched TclXML from my whiteboard application.
>See "http://hem.fyristorg.com/matben/"
>
>Best Wishes, Mats

Mats,

Thanks for the input. I applied your change to dike out the Rose patch
from 2.5 (though I do find the fact that Marshall Rose is using tclxml a
nice endorsement..)

It does not crash any longer, but I can't say that I consider the result
correct. It's not what I expect at any rate..

Here's the test case with -final 1 (default)

####
package require xml

proc cdata {data args} {
    puts $data
}

set parser [::xml::parser parseit \
    -characterdatacommand cdata ]

$parser parse "<the>world</the>"
####

It outputs:

solabel10% ./doit | od -c
0000000   w   o   r   l   d  \n
0000006

This makes sense to me.

Here's the test case with -final 0..

####
package require xml

proc cdata {data args} {
    puts $data
}

set parser [::xml::parser parseit \
    -characterdatacommand cdata -final 0 ]

$parser parse "<the>"
$parser parse "world"
$parser parse "</the>"
$parser configure -final 1
####

It outputs:

solabel10% ./doit2 | od -c
0000000  \n   w   o   r   l   d  \n  \n
0000010

Note the extraneous leading newline, and the extraneous trailing one
(one comes from the "puts" of course..)

        Ted
From: Ted N. S. A. GA <te...@ag...> - 2003-02-04 19:03:58
Hello folks,

You may recall that back in December, I had found a bug in using the
pure Tcl version of tclxml to parse XML streams with the "-final 0" option.

I think someone suggested diking out the Rose patch for bug #596959, but
there were others saying that would break something else, and there was
no real resolution to the issue.

I recently had a rare spare day to look at it again. I certainly can't
claim to understand the code, which uses regular expressions in a much
more heavy-duty fashion than anything else I have seen in Tcl. However,
after hours of playing around with "puts", I was drawn back to the Rose
patch.

It seems to be intended to insert a null XML element into the stream for
some reason, but I question the way it does this.

The tokenized xml seems to be put into a list in 4-tuples, as in the
loop at line 332 of sgmlparser.tcl:

        foreach {tag close param text} $sgml

The Rose patch on line 175 of sgmlparser.tcl inserts -5- empty elements
into the tokenized xml:

        set sgml [linsert $sgml 0 {} {} {} {} {}]

This pushes what was a "tag" into being a "close".

Should this not be -4- empty elements?

If I apply the following patch:

-----CUT HERE-----
*** sgmlparser.tcl.bak  Tue Feb  4 13:34:40 2003
--- sgmlparser.tcl      Tue Feb  4 13:40:16 2003
***************
*** 172,178 ****
            # Patch from bug report #596959, Marshall Rose
            if {[string compare [lindex $sgml 4] ""]} {
!               set sgml [linsert $sgml 0 {} {} {} {} {}]
            }
        } else {
--- 172,178 ----
            # Patch from bug report #596959, Marshall Rose
            if {[string compare [lindex $sgml 4] ""]} {
!               set sgml [linsert $sgml 0 {} {} {} {} ]
            }
        } else {
-----CUT HERE---

Then my test program seems to run OK.

Comments?

        Ted

PS: Here's my test program. Run it with no args to do a piecemeal
parse with -final 0. Run it with 1 arg to do an all-at-once parse with
-final 0, and run it with 2 args to do an all-at-once parse with -final 1.

Before the "patch", case 1 will error out, case 2 will produce no output
and case 3 will work OK.

After the patch, all 3 cases produce the same output.

----CUT HERE----
#!/usr/local/bin/tclsh8.4

package require xml

proc xml_el_start {name attrs args} {
    puts "Start name ($name) attrs ($attrs) args ($args)"
}

proc xml_el_end {name args} {
    puts "End name ($name) args ($args)"
}

proc xml_char_data {data} {
    if { [string length $data] } {
        puts "Cdata data ($data)"
    }
}

set parser [ ::xml::parser \
    -elementstartcommand xml_el_start \
    -elementendcommand xml_el_end \
    -characterdatacommand xml_char_data \
    -defaultcommand xml_default \
    -final 0 \
]

if { [llength $argv] == 0 } {
    $parser parse "<fooby>"
    $parser parse "<hello>"
    $parser parse "world"
    $parser parse "</hello>"
    $parser parse "</fooby>"
} else {
    if { [llength $argv] == 2} {
        $parser configure -final 1
    }
    $parser parse {<fooby><hello>world</hello></fooby>}
}
----CUT HERE---
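The alignment question raised in the message above can be checked without tclxml at all. What follows is only a sketch in plain Tcl: the token list and the helper proc `tuples` are made up for illustration and are not taken from sgmlparser.tcl. It walks a list four elements at a time, the way the `foreach {tag close param text}` loop does, after prepending either four or five empty elements.

```tcl
# Hypothetical token list in {tag close param text} order.
# The values are invented for illustration, not real parser output.
set sgml [list hello {} {} world  hello / {} {}]

# Hypothetical helper: consume the list four elements at a time
# and return each group as a {tag close param text} sublist.
proc tuples {sgml} {
    set out {}
    foreach {tag close param text} $sgml {
        lappend out [list $tag $close $param $text]
    }
    return $out
}

# Prepend four empty elements (one whole empty tuple) ...
set four [linsert $sgml 0 {} {} {} {}]
# ... versus five, as in the line quoted from the Rose patch.
set five [linsert $sgml 0 {} {} {} {} {}]

foreach t [tuples $four] { puts "4 empties: $t" }
foreach t [tuples $five] { puts "5 empties: $t" }
```

With four empties the second group still has "hello" in the tag position. With five, every later element shifts by one, so "hello" lands in the close position and "world" ends up being treated as a tag, which matches the symptom described above. Whether the extra element in the real parser was intentional is something only the patch author can say.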
From: Steve B. <Ste...@zv...> - 2003-02-04 20:50:43
Ted Nolan SRI Augusta GA wrote:

> You may recall that back in December, I had found a bug in using the
> pure Tcl version of tclxml to parse XML streams with the "-final 0" option.

[...snip...]

> The tokenized xml seems to be put into a list in 4-tuples, as in the
> loop at line 332 of sgmlparser.tcl:
>
>         foreach {tag close param text} $sgml
>
> The Rose patch on line 175 of sgmlparser.tcl inserts -5- empty elements
> into the tokenized xml:
>
>         set sgml [linsert $sgml 0 {} {} {} {} {}]
>
> This pushes what was a "tag" into being a "close".
>
> Should this not be -4- empty elements?
>
> If I apply the following patch:

[...snip...]

The thing to do at this stage is get the test suite set up correctly.
I'll look at adding your test script to the suite today, and run the
tests against the current version of the parser and then with your
patch applied.

This should get into the v2.6 release.

Cheers,
Steve Ball

--
Steve Ball            |   XSLT Standard Library   | Training & Seminars
Zveno Pty Ltd         |     Web Tcl Complete      | XML XSL Schemas
http://www.zveno.com/ |      TclXML TclDOM        | Tcl, Web Development
Ste...@zv...          +---------------------------+---------------------
Ph. +61 2 6242 4099   | Mobile (0413) 594 462     | Fax +61 2 6242 4099