From: Ildar <ild...@gm...> - 2006-09-28 06:15:17
|
Hi all,

I'm new here and may have missed the page that describes what I need. If so, please send me the link to it.

I'm playing with the following XML:

    <?xml version="1.0" encoding="utf-8"?>
    <root>
      <level1>
        <level2-1>hello</level2-1>
      </level1>
    </root>

I need to get the content of /root/level1/level2-1. My code is as follows:

    wxXml2Document doc;
    wxString err;

    doc.Load(wxT("****.xml"), &err);

    if (doc.IsOk())
    {
        wxXml2Node node = doc.GetRoot();
        wxString s = node.Find(wxT("level2-1")).GetName();
        wxString s1 = node.Find(wxT("level2-1")).GetContent();
        wxString s2 = node.Find(wxT("level2-1")).GetFirstChild().GetContent();
    }

The results are:

    s == "level2-1"   // as expected
    s1 == ""          // I expected to get "hello"
    s2 == "hello"     // ?

The way I'm accessing the value is quite weird and doesn't seem to be safe...

Is there an easier way? Any links/examples/ideas are welcome.

Thank you in advance.

--
Ildar
|
From: Francesco M. <f18...@ya...> - 2006-09-28 13:03:48
|
Ildar wrote:
> hi all,
>
> I'm new here and might miss the page that describes what I need. Please, send
> me the link to the page this case.
>
>
> I'm playing with the following XML.
>
> <?xml version="1.0" encoding="utf-8"?>
> <root>
>   <level1>
>     <level2-1>hello</level2-1>
>   </level1>
> </root>
>
> I need to get the content of /root/level1/level2-1.
> My code is as follows:
>
> wxXml2Document doc;
> wxString err;
>
> doc.Load(wxT("****.xml"), &err);
>
> if (doc.IsOk())
> {
>     wxXml2Node node = doc.GetRoot();
>     wxString s = node.Find(wxT("level2-1")).GetName();
>     wxString s1 = node.Find(wxT("level2-1")).GetContent();
>     wxString s2 = node.Find(wxT("level2-1")).GetFirstChild().GetContent();
> }
>
> the results are:
>
> s == "level2-1" // the same as expected
> s1 == "" // I expected to get "hello"
> s2 == "hello" // ?

Yes, sure, no bugs. This is really a FAQ and I should add it somewhere.

The results you get are because your XML tree is parsed as:

wxXML_ELEMENT_NODE with name "root" and content=""
|- wxXML_ELEMENT_NODE with name "level1" and content=""
   |- wxXML_ELEMENT_NODE with name "level2-1" and content=""
      |- wxXML_TEXT_NODE with name "" and content="hello"

Thus, if the node is an element node, you shouldn't look at its content but rather at its children. If the node is a text node, the name is empty (or maybe fixed to "text", I don't remember) and you should just look at its content.

I've added the GetNodeContent() function in CVS, which simplifies the process (see below).

> The way I'm accessing the value is quite weird and doesn't seem to be safe...
>
> Is there an easier way?

    // start from the first child of the root element
    wxXml2Node child = doc.GetRoot().GetFirstChild();
    while (child != wxXml2EmptyNode) {

        if (child.GetName() == wxT("level1")) {

            // process text enclosed by <level1>
            wxString content = child.GetNodeContent();

        } else if (child.GetName() == wxT("tag2")) {

            // process tag2 ...

        } else {

            // unknown tag?
        }

        child = child.GetNext();
    }

This pattern lets you scan each node of the XML and, for instance, catch structural errors such as unknown tags.

Francesco
|
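To make the parsed tree above concrete, here is a minimal standalone sketch. It does not use wxXml2 at all: `Node`, `NodeType`, `nodeText()` and `sampleTree()` are hypothetical names invented for illustration, and `nodeText()` only imitates the behaviour described for GetNodeContent() (concatenating the text found below a node). It shows why an element's own content is empty while its text child holds "hello".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed XML node; this is NOT the wxXml2Node class.
enum class NodeType { Element, Text };

struct Node {
    NodeType type;
    std::string name;            // element name; empty for text nodes
    std::string content;         // character data; empty for element nodes
    std::vector<Node> children;  // child nodes in document order
};

// Concatenate the content of every text node at or below `n`.
std::string nodeText(const Node& n) {
    if (n.type == NodeType::Text) return n.content;
    std::string out;
    for (const Node& c : n.children) out += nodeText(c);
    return out;
}

// Build the tree for <root><level1><level2-1>hello</level2-1></level1></root>.
Node sampleTree() {
    Node text{NodeType::Text, "", "hello", {}};
    Node level2{NodeType::Element, "level2-1", "", {text}};
    Node level1{NodeType::Element, "level1", "", {level2}};
    return Node{NodeType::Element, "root", "", {level1}};
}

// Walk down to <level2-1> and compare the three ways of reading it.
void demo() {
    const Node root = sampleTree();
    const Node& level2 = root.children[0].children[0];
    assert(level2.name == "level2-1");              // corresponds to s
    assert(level2.content.empty());                 // corresponds to s1
    assert(level2.children[0].content == "hello");  // corresponds to s2
    assert(nodeText(level2) == "hello");            // concatenated text
}
```

Calling `demo()` from a `main()` runs the checks; the three first assertions mirror s, s1 and s2 from the original question.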
From: Ildar <ild...@gm...> - 2006-09-28 15:26:18
|
Francesco,

Thank you, I got the idea.

However, 2 more questions:

#1

    wxString s3 = node.Find(wxT("level2-1")).GetFirstChild().GetName();

result:
    s3 == "text" // as you mentioned

but

    wxString s4 = node.Find(wxT("level2-1")).Get(wxT("text"));

result:
    s4 == ""

shouldn't it return "hello"?

#2

is the following possible?

|- wxXML_ELEMENT_NODE with name "level2-1" and content=""
   |- wxXML_TEXT_NODE with name "" and content="hello"
   |- wxXML_TEXT_NODE with name "" and content="bye"

wxXML_ELEMENT_NODE - contains 2 text nodes.
|
From: Francesco M. <f18...@ya...> - 2006-09-28 16:03:04
|
Ildar wrote:
> Francesco,
>
> Thank you, I got the idea.
>
> However, 2 more questions:
>
> #1
>
> wxString s3 = node.Find(wxT("level2-1")).GetFirstChild().GetName();
>
> result:
> s3 == "text" // as you mentioned
>
> but
>
> wxString s4 = node.Find(wxT("level2-1")).Get(wxT("text"));
>
> result:
> s4 == ""

wxXml2Node::Get returns a wxXml2Node, not a wxString. I don't know why the compiler accepts the assignment of a wxXml2Node to a wxString... Couldn't you trace it with a debugger and see what happens?

> #2
>
> is the following possible?
>
> |- wxXML_ELEMENT_NODE with name "level2-1" and content=""
>    |- wxXML_TEXT_NODE with name "" and content="hello"
>    |- wxXML_TEXT_NODE with name "" and content="bye"
>
> wxXML_ELEMENT_NODE - contains 2 text nodes.

Sure, it's possible. IIRC there are some rules that make libxml2 split a text into two text nodes, but I don't remember them right now. However, consider also the case of this XML fragment:

    <node>
      hello <b>XML</b> world!
    </node>

The "node" node has two text nodes interleaved by an element node (with name="b" and with a single text node as child).

Francesco
|
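The mixed-content case can be illustrated with a small standalone model. This is only a sketch: `Node`, `NodeType`, `childSummary()` and `mixedSample()` are hypothetical names, not part of wxXml2 or libxml2, and the whitespace inside the text nodes is simplified to single spaces as an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed XML node; this is NOT the wxXml2Node class.
enum class NodeType { Element, Text };

struct Node {
    NodeType type;
    std::string name;            // element name; empty for text nodes
    std::string content;         // character data; empty for element nodes
    std::vector<Node> children;  // child nodes in document order
};

// One label per direct child: "text:<content>" or "element:<name>".
std::vector<std::string> childSummary(const Node& n) {
    std::vector<std::string> out;
    for (const Node& c : n.children) {
        if (c.type == NodeType::Text) out.push_back("text:" + c.content);
        else out.push_back("element:" + c.name);
    }
    return out;
}

// Build the tree for <node> hello <b>XML</b> world! </node>
// (whitespace reduced to single spaces for this sketch).
Node mixedSample() {
    Node bText{NodeType::Text, "", "XML", {}};
    Node b{NodeType::Element, "b", "", {bText}};
    Node before{NodeType::Text, "", " hello ", {}};
    Node after{NodeType::Text, "", " world! ", {}};
    return Node{NodeType::Element, "node", "", {before, b, after}};
}

// The element has three direct children: text, element, text.
void demo() {
    const std::vector<std::string> kinds = childSummary(mixedSample());
    assert(kinds.size() == 3);
    assert(kinds[0] == "text: hello ");
    assert(kinds[1] == "element:b");
    assert(kinds[2] == "text: world! ");
}
```

The point of the sketch is that code reading only the first child of an element would see " hello " and miss everything after the `<b>` element.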
From: Ildar <ild...@gm...> - 2006-09-29 14:17:54
|
> > However, 2 more questions:
> >
> > #1
> >
> > wxString s3 = node.Find(wxT("level2-1")).GetFirstChild().GetName();
> >
> > result:
> > s3 == "text" // as you mentioned
> >
> > but
> >
> > wxString s4 = node.Find(wxT("level2-1")).Get(wxT("text"));
> >
> > result:
> > s4 == ""
> wxXml2Node::Get returns a wxXml2Node, not a wxString. I don't know why
> the compiler accepts the assignment of a wxXml2Node to a wxString...
> couldn't you trace it with a debugger and see what happens?

Sorry, the code is as follows:

    wxString s4 = node.Find(wxT("level2-1")).Get(wxT("text")).GetContent();

Btw,

    wxString s5 = node.Find(wxT("level2-1")).Get(wxT("text")).GetName();

result:
    s5 == ""
|
From: Francesco M. <f18...@ya...> - 2006-09-29 23:30:20
|
Ildar wrote:
>>> However, 2 more questions:
>>>
>>> #1
>>>
>>> wxString s3 = node.Find(wxT("level2-1")).GetFirstChild().GetName();
>>>
>>> result:
>>> s3 == "text" // as you mentioned
>>>
>>> but
>>>
>>> wxString s4 = node.Find(wxT("level2-1")).Get(wxT("text"));
>>>
>>> result:
>>> s4 == ""
>> wxXml2Node::Get returns a wxXml2Node, not a wxString. I don't know why
>> the compiler accepts the assignment of a wxXml2Node to a wxString...
>> couldn't you trace it with a debugger and see what happens?
>
> Sorry, the code is as follows
>
> wxString s4 = node.Find(wxT("level2-1")).Get(wxT("text")).GetContent();
>
> Btw,
>
> wxString s5 = node.Find(wxT("level2-1")).Get(wxT("text")).GetName();

It looks like Get() does not find that node. Can you trace it with a debugger? It should be easy to follow the wxXml2 source code and find what's wrong (even if I suspect it's not really a bug).

Thanks,
Francesco
|
From: Ildar <ild...@gm...> - 2006-09-28 15:37:35
|
And one more ;)

Is there a way to get a count of nodes? E.g.:

- Find(...) lets you get the N-th node that meets the condition, but how many nodes meet the condition?

- Or how do I get the number of wxXML_ELEMENT_NODE subnodes of a wxXML_ELEMENT_NODE node?

I understand I can enumerate the nodes and count them; is that the only possible solution?
|
From: Francesco M. <f18...@ya...> - 2006-09-28 16:13:06
|
Ildar wrote:
> and one more ;)
>
> is there a way to get a count of nodes?

Currently no. The wxXml2 API derives from the wxXml API: when I wrote it I wanted to port some wxXml code to a newer API with DTD validation, so I made the wxXml2 API mostly compatible with the wxXml one, which has no GetChildCount() method. However, I agree this would be useful to have. Patches welcome :D

> e.g.
>
> - Find(...) allows to get the N-th node which meets the condition, but what's
> the number of nodes which meet the condition?

We could add a new

    GetChildrenCount(const wxXml2Node &tofind, bool bNS = TRUE, bool recurse = TRUE)

function to return it.

> - or how to get the number of wxXML_ELEMENT_NODE subnodes of wxXML_ELEMENT_NODE
> node?

I would add a GetChildrenCount() const function for this.

> I understand I can enumerate the nodes and count them, is it the only possible
> solution?

It highly depends on how you want to organize your parser. For me, the pattern I copied & pasted in my previous mail has always been a winning one ;) I.e. you scan the whole XML document, handle unknown tags in the else{} part, and call subroutines to load other nodes in the other else if{} branches.

It's been a long time since I looked at XML Schemas, SAX parsers, etc. You can get an idea of the "low-level" libxml2 API by looking at http://xmlsoft.org/html/index.html

Francesco
|
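Until such a counting method exists, counting by enumeration is straightforward. The sketch below uses a hypothetical standalone model rather than wxXml2: `Node`, `NodeType`, `countChildElements()`, `countMatching()` and `sampleTree()` are illustrative names only and are not the GetChildrenCount() functions proposed above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed XML node; this is NOT the wxXml2Node class.
enum class NodeType { Element, Text };

struct Node {
    NodeType type;
    std::string name;            // element name; empty for text nodes
    std::string content;         // character data; empty for element nodes
    std::vector<Node> children;  // child nodes in document order
};

// Number of direct children that are element nodes.
std::size_t countChildElements(const Node& n) {
    std::size_t count = 0;
    for (const Node& c : n.children)
        if (c.type == NodeType::Element) ++count;
    return count;
}

// Number of element nodes below `n` named `name`; with recurse == false
// only the direct children are considered.
std::size_t countMatching(const Node& n, const std::string& name, bool recurse) {
    std::size_t count = 0;
    for (const Node& c : n.children) {
        if (c.type == NodeType::Element && c.name == name) ++count;
        if (recurse) count += countMatching(c, name, true);
    }
    return count;
}

// <root><item>a</item><group><item>b</item></group><item>c</item></root>
Node sampleTree() {
    auto item = [](const std::string& text) {
        return Node{NodeType::Element, "item", "", {Node{NodeType::Text, "", text, {}}}};
    };
    Node group{NodeType::Element, "group", "", {item("b")}};
    return Node{NodeType::Element, "root", "", {item("a"), group, item("c")}};
}

void demo() {
    const Node root = sampleTree();
    assert(countChildElements(root) == 3);            // item, group, item
    assert(countMatching(root, "item", false) == 2);  // direct children only
    assert(countMatching(root, "item", true) == 3);   // includes the nested one
}
```

With a real wxXml2 tree the same loops would presumably be written with GetFirstChild() and GetNext(), as in the scanning pattern earlier in the thread.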
From: Armel A. <ass...@wa...> - 2006-09-28 15:38:36
|
>> the results are:
>>
>> s == "level2-1" // the same as expected
>> s1 == "" // I expected to get "hello"
>> s2 == "hello" // ?
> yes, sure. no bugs.
> This is really a FAQ and I should add it somewhere.
>
> The results you get are because your XML tree is parsed as:

Nonetheless, this is not really XSL-like (which is what Ildar expected). It would be cool to have a GetAllText() that returns all the simple text of all sub-nodes, concatenated.

Armel
|
From: Francesco M. <f18...@ya...> - 2006-09-28 15:55:25
|
Armel Asselin wrote:
>>> the results are:
>>>
>>> s == "level2-1" // the same as expected
>>> s1 == "" // I expected to get "hello"
>>> s2 == "hello" // ?
>> yes, sure. no bugs.
>> This is really a FAQ and I should add it somewhere.
>>
>> The results you get are because your XML tree is parsed as:
> nonetheless, this is not really XSL-like (which Ildar expected). It could be
> cool to have a GetAllText ( ) all the simple text of all sub nodes
> concatenated.

I just added the GetNodeContent() function to CVS, which does exactly that.

Francesco
|
From: Armel A. <ass...@wa...> - 2006-09-28 15:58:54
|
>> nonetheless, this is not really XSL-like (which Ildar expected). It could
>> be
>> cool to have a GetAllText ( ) all the simple text of all sub nodes
>> concatenated.
> I just added to CVS the GetNodeContent() functions which does exactly
> that.

cool :)

Armel
|
From: Ildar <ild...@gm...> - 2006-09-30 07:36:22
|
> I just added to CVS the GetNodeContent() functions which does exactly that.

What do you think of adding a "recursive" param to the function? E.g.:

GetNodeContent(true) would work as it works now: it would return the concatenation of the text of all sub-nodes.

GetNodeContent(false) would return the concatenation of only the wxXML_TEXT_NODE nodes that belong directly to the current wxXML_ELEMENT_NODE.
|
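The difference between the two proposed modes can be sketched on a hypothetical standalone model. Nothing here is wxXml2 code: `Node`, `NodeType`, `textContent()` and `mixedSample()` are made-up names, and the flag only imitates the suggestion above, not an existing GetNodeContent() signature.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed XML node; this is NOT the wxXml2Node class.
enum class NodeType { Element, Text };

struct Node {
    NodeType type;
    std::string name;            // element name; empty for text nodes
    std::string content;         // character data; empty for element nodes
    std::vector<Node> children;  // child nodes in document order
};

// recursive == true : concatenate every text node below `n`.
// recursive == false: concatenate only the text nodes that are direct children.
std::string textContent(const Node& n, bool recursive) {
    if (n.type == NodeType::Text) return n.content;
    std::string out;
    for (const Node& c : n.children) {
        if (c.type == NodeType::Text) out += c.content;
        else if (recursive) out += textContent(c, true);
    }
    return out;
}

// Build the tree for <node>hello <b>XML</b> world!</node>.
Node mixedSample() {
    Node bText{NodeType::Text, "", "XML", {}};
    Node b{NodeType::Element, "b", "", {bText}};
    Node before{NodeType::Text, "", "hello ", {}};
    Node after{NodeType::Text, "", " world!", {}};
    return Node{NodeType::Element, "node", "", {before, b, after}};
}

void demo() {
    const Node node = mixedSample();
    assert(textContent(node, true) == "hello XML world!");
    // The text inside <b> is skipped, so the two surrounding spaces meet.
    assert(textContent(node, false) == "hello  world!");
}
```

For an element whose only child is a text node, such as `<level2-1>hello</level2-1>`, both modes give the same result; they differ only for mixed content.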