[Tclxml-users] newbie basic question using parser to extract Tag names and characterdata together...
From: Anthony G <ant...@gm...> - 2007-01-31 23:33:09
Hello TclXML experts :)

I have been searching online for a few days for tutorials on how to use TclXML, and I have a very basic task. I want to be able to take an XML file, say like this:

    <MY_STRUCTURE>
      <TAG2> tag2 data </TAG2>
      <TAG3> tag3 data </TAG3>
      <TAG4>
        <TAG5> tag5 data </TAG5>
        <TAG6> tag6 data </TAG6>
      </TAG4>
    </MY_STRUCTURE>

and export the data above into a data structure, say a list or maybe even a hashtable (Tcl array). For example, I would want to create an array MYSTRUCTURE with indices based on the tags that fall under it:

    set MYSTRUCTURE(TAG2) "tag2 data"
    set MYSTRUCTURE(TAG3) "tag3 data"
    set MYSTRUCTURE(TAG4,TAG5) "tag5 data"
    set MYSTRUCTURE(TAG4,TAG6) "tag6 data"

or something like this, where the index indicates the nesting level.

What I have seen so far is that I can use xml::parser to pull out tags and data, but not both at the same time:

    set parser [::xml::parser -characterdatacommand cdata]    ;# pulls out the data between tags
    set parser2 [::xml::parser -elementendcommand cdata2]     ;# gives me the tag name once the end tag is hit

Is there a way to combine the functionality of these so that I can see which data falls under which tags?

Thank you for your help and have a great day

-Anthony
From: Steve B. <Ste...@ex...> - 2007-02-01 10:52:44
Hi Anthony,

TclXML gives you a streaming interface, which means that your application must keep track of the state of the parser as it goes through the document structure. So you have to use *both* the tags and data to make it all work together.

If you're worried about making that work, then consider using TclDOM instead. With a DOM tree the context has already been built for you.

If you're feeling more adventurous, you could develop a solution that uses a combination of XSLT and Tcl, based on TclXSLT. A lot of code that I write these days uses an XSL stylesheet to process the XML document, and the result of the stylesheet is a Tcl script that the Tcl application eval's.

Hope that helps,
Steve Ball
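To make the "keep track of the state" advice concrete, here is a minimal sketch of how the three callbacks might be combined on a single parser. The proc and variable names (elemStart, elemEnd, charData, xmlToArray, path, buffer) are made up for illustration and are not part of TclXML. It assumes the usual TclXML callback signatures (element start receives the name, the attribute list and optional extra arguments; element end receives the name and optional extra arguments; character data receives the text). It also assumes that only leaf elements carry text, as in the sample document, so mixed content would be lost.

```tcl
# Sketch only: all proc and variable names here are hypothetical.
set path {}      ;# stack of the element names that are currently open
set buffer ""    ;# character data collected since the last tag

# Element start: push the tag name and start collecting text afresh.
proc elemStart {name attlist args} {
    global path buffer
    lappend path $name
    set buffer ""
}

# Character data may arrive in several chunks, so append rather than set.
proc charData {data} {
    global buffer
    append buffer $data
}

# Element end: if the element held non-blank text, store it under a key
# built from the open tags, leaving out the document element (index 0).
proc elemEnd {name args} {
    global path buffer MYSTRUCTURE
    set value [string trim $buffer]
    if {$value ne ""} {
        set key [join [lrange $path 1 end] ,]
        set MYSTRUCTURE($key) $value
    }
    set path [lrange $path 0 end-1]   ;# pop the tag that just closed
    set buffer ""
}

# Wire all three callbacks into ONE parser and run it over the XML text.
proc xmlToArray {xml} {
    global path buffer
    package require xml
    set path {}
    set buffer ""
    set parser [::xml::parser \
        -elementstartcommand elemStart \
        -elementendcommand elemEnd \
        -characterdatacommand charData]
    $parser parse $xml
}
```

With the sample document held in a variable xml, calling "xmlToArray $xml" followed by "parray MYSTRUCTURE" should list keys such as TAG2 and TAG4,TAG5. The key point is that one parser takes all three options at once, rather than one parser per callback.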
From: Anthony G <ant...@gm...> - 2007-02-01 17:09:01
I am new to TclXML and would be very grateful for some basic examples. If I use TclDOM, does that work on raw XML? Or does the XML need to be put into some other format?

Thank you in advance for your help,

-Anthony
From: Steve B. <Ste...@ex...> - 2007-02-01 21:27:22
Hi Anthony,

TclDOM works on the raw XML. You build a DOM tree like this:

    package require dom
    set doc [dom::parse $xml]

There are other options, like -baseuri, etc, that are useful - check the documentation.

DOM provides many commands and methods for navigating the tree. A particularly useful method is selectNode, which uses XPath. Thus you can find all of the descendants of the document element:

    set nodeList [dom::selectNode $doc /*//*]

However, you don't actually want that - you want all leaf nodes, that is, elements with no child elements:

    set nodeList [dom::selectNode $doc {//*[not(*)]}]

"nodeList" is a static list of node tokens that you can iterate over. You can visit each node and perform some computation. Bear in mind that the node's textual content is actually in a child text node. The easiest way to get the content is by accessing the node's string value.

Another tricky part is to find the array index. Fortunately, TclDOM provides the path method. We'll use that in a procedure to compute a suitable index string.

    proc getpath node {
        # Get the path back to the root node,
        # but exclude the root node and document element
        set ancestors [lrange [$node path] 2 end]

        # Now turn the node tokens into tag names
        set result {}
        foreach ancestor $ancestors {
            lappend result [$ancestor cget -nodeName]
        }

        return [join $result ,]
    }

    foreach node $nodeList {
        array set MYSTRUCTURE [list [getpath $node] [$node stringValue]]
    }

Your array now has the correct entries!

HTH,
Steve Ball
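One follow-up detail: the sample document has spaces around its text (for example " tag2 data "), so the string values stored in the array will most likely carry that whitespace too. The sketch below uses plain Tcl only, with a hand-filled array standing in for the one built by the loop above, and shows one way to trim the values and list the result in a stable order.

```tcl
# Stand-in for the array produced by the DOM loop; the values keep the
# surrounding spaces that appear in the sample document.
array set MYSTRUCTURE {
    TAG2      " tag2 data "
    TAG3      " tag3 data "
    TAG4,TAG5 " tag5 data "
    TAG4,TAG6 " tag6 data "
}

# Trim leading and trailing whitespace from every value in place.
foreach key [array names MYSTRUCTURE] {
    set MYSTRUCTURE($key) [string trim $MYSTRUCTURE($key)]
}

# Array order is not defined in Tcl, so sort the keys before printing.
foreach key [lsort [array names MYSTRUCTURE]] {
    puts [format "%-10s = %s" $key $MYSTRUCTURE($key)]
}
```

Trimming after the fact keeps the extraction loop unchanged. Whether whitespace should be preserved depends on the data, so treat this step as optional.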