From: Mats B. <ma...@pr...> - 2003-02-06 07:43:18
Ted Nolan wrote:
> You may recall that back in December, I had found a bug in using the
> pure Tcl version of tclxml to parse XML streams with the "-final 0" option.
>
> I think someone suggested diking out the Rose patch for bug #596959, but
> there were others saying that would break something else, and there was
> no real resolution to the issue.

I believe this patch is not correct, but I don't want to remove it from
CVS, since that would be bad practice. Look at the parts of the code tagged
with "Mats", where I cache unmatched xml to be prepended to the next xml
chunk. This also works for xml chopped off at arbitrary points:

    <stream><|...|junk var= 'u|...|ndef'/>

... you get it.

    # This RE only fixes chopped inside tags, not chopped text.
    if {[regexp {^([^<]*)(<[^>]*$)} [lindex $sgml end] x text rest]} {
        set sgml [lreplace $sgml end end $text]
        # Mats: unmatched stuff means that it is chopped off. Cache it for next round.
        set state(leftover) $rest
    }

You may check out my copy of TclXML included in my application at:
"http://coccinella.sourceforge.net"

One additional complication may arise if you get xml chunks from events
(socket) and an update in one of the callbacks triggers a new parse
operation. The xml parser is, of course, not reentrant, so serialization
must be enforced. See the code below:

    proc wrapper::parsereentrant {p xml} {
        variable refcount
        variable stack

        incr refcount
        if {$refcount == 1} {

            # This is the main entry: do parse original xml.
            $p parse $xml

            # Parse everything on the stack (until empty?).
            while {[string length $stack] > 0} {
                set tmpstack $stack
                set stack ""
                $p parse $tmpstack
            }
        } else {

            # Reentry, put on stack for delayed execution.
            append stack $xml
        }
        incr refcount -1
        return {}
    }

Mats
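
For illustration only, here is a minimal standalone sketch of the leftover caching idea described above. It is not the TclXML code: the namespace "demo" and the proc "demo::feed" are made-up names. It assumes a ">" never appears inside an attribute value, comment or CDATA section, and like the RE above it only handles chunks chopped inside a tag.

```tcl
# Hypothetical helper, not part of TclXML: holds back an unterminated
# trailing tag so that it can be prepended to the next chunk.
namespace eval demo {
    variable leftover ""
}

proc demo::feed {chunk} {
    variable leftover

    # Prepend whatever was cut off at the end of the previous chunk.
    set data $leftover$chunk
    set leftover ""

    # A "<" with no ">" after it means the data ends inside a tag.
    # Return everything before it and cache the rest for the next round.
    if {[regexp -indices {<[^>]*$} $data idx]} {
        set start [lindex $idx 0]
        set leftover [string range $data $start end]
        set data [string range $data 0 [expr {$start - 1}]]
    }
    return $data
}

puts [demo::feed "<stream><junk var='u"]   ;# prints: <stream>
puts [demo::feed "ndef'/>"]                ;# prints: <junk var='undef'/>
```

Each call returns only the part that is safe to hand to the parser, so the caller would pass the return value to "$p parse" rather than the raw chunk.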