From: Igor L. S. <ism...@st...> - 2004-10-18 12:18:26
|
Hello libxmlplusplus-general,

I've run into an odd problem. I'm working with libxml++-1.0.4 and I need to load an XML document substituting all external entities. Looking at the example in /dom_parse_entities I noticed that if I do parser.set_substitute_entities(true), the program crashes.

I wrote the following small function using only libxml2 to load a document with entity substitution:

void parse_file_test(const std::string& filename)
{
  xmlParserCtxtPtr ctxt;
  xmlDocPtr doc;

  ctxt = xmlNewParserCtxt();
  if (ctxt == NULL)
  {
    std::cerr << "Failed to allocate parser context" << std::endl;
    return;
  }

  doc = xmlCtxtReadFile(ctxt, filename.c_str(), NULL,
                        XML_PARSE_DTDVALID | XML_PARSE_NOENT);
  if (doc == NULL)
  {
    std::cerr << "Failed to parse " << filename << std::endl;
  }
  else
  {
    if (ctxt->valid == 0)
    {
      std::cerr << "Failed to validate " << filename << std::endl;
    }
    else
    {
      FILE* f = NULL;
      f = fopen("res.xml", "w");
      if (f)
      {
        xmlDocDump(f, doc);
        fclose(f);
      }
    }
    xmlFreeDoc(doc);
  }
  xmlFreeParserCtxt(ctxt);
}

and it worked just fine.

I then added this very function to DomParser as a method DomParser::parse_file_test() and rebuilt libxml++.
I then called this method from a small program which used libxml++ and it crashed:

#0 0x00000019 in ?? ()
#1 0x0806a320 in xmlFreeNodeList (cur=0x82311a0) at tree.c:3282
#2 0x0804f1a6 in xmlFreeEntity (entity=0x821ffe8) at entities.c:72
#3 0x0806dff0 in xmlHashFree (table=0x821e7d0, f=0x804fb2c <xmlFreeEntityWrapper>) at hash.c:284
#4 0x0804fb53 in xmlFreeEntitiesTable (table=0x821e7d0) at entities.c:690
#5 0x08067f1c in xmlFreeDtd (cur=0x821e6e0) at tree.c:1023
#6 0x080680d4 in xmlFreeDoc (cur=0x821e570) at tree.c:1114
#7 0x08049bae in parse_file_test(std::string const&) (filename=@0xbffff7a0) at domparser.cc:277
#8 0x08048577 in main (argc=1, argv=0xbffff824) at main.cc:84
#9 0x081202e6 in __libc_start_main ()

Went into xmlFreeDoc never to return...

I then declared the method static in the DomParser class so as to rule out object instantiation. The program crashed all the same.

Finally I moved parse_file_test() out of the DomParser class and declared it as a _function_. Didn't help - the program would still crash.

Coming back to that libxml2 test that did work - I recompiled it with parse_file_test() in a small library and it worked just fine. It looks like the problem only arises when libxml++ is involved.

So what am I doing wrong? My goal is to load an XML document with DTD validation and substitution of external entities.

Thank you.

--
Best regards,
Igor mailto:ism...@st...
From: Christophe de V. <cde...@al...> - 2004-10-18 13:29:14
|
Hi Igor,

Igor L. Smolovski wrote:

>[...]
>
>I then added this very function to DomParser as a method DomParser::parse_file_test() and rebuilt libxml++.
>I then called this method from a small program which used libxml++ and it crashed:
>
>#0 0x00000019 in ?? ()
>#1 0x0806a320 in xmlFreeNodeList (cur=0x82311a0) at tree.c:3282
>#2 0x0804f1a6 in xmlFreeEntity (entity=0x821ffe8) at entities.c:72
>#3 0x0806dff0 in xmlHashFree (table=0x821e7d0, f=0x804fb2c <xmlFreeEntityWrapper>) at hash.c:284
>#4 0x0804fb53 in xmlFreeEntitiesTable (table=0x821e7d0) at entities.c:690
>#5 0x08067f1c in xmlFreeDtd (cur=0x821e6e0) at tree.c:1023
>#6 0x080680d4 in xmlFreeDoc (cur=0x821e570) at tree.c:1114
>#7 0x08049bae in parse_file_test(std::string const&) (filename=@0xbffff7a0) at domparser.cc:277
>#8 0x08048577 in main (argc=1, argv=0xbffff824) at main.cc:84
>#9 0x081202e6 in __libc_start_main ()
>
>Went into xmlFreeDoc never to return...

Could you post the smallest possible program that reproduces the problem (possibly the one mentioned above)? This would save a lot of time in understanding what's going wrong.

Regards,

Christophe
From: Igor L. S. <ism...@st...> - 2004-10-18 17:30:32
|
Hello Christophe,

Monday, October 18, 2004, 5:29:04 PM, you wrote:

CdV> Could you post the smallest possible program that reproduces the problem
CdV> (possibly the one mentioned above)? This would save a lot of time in
CdV> understanding what's going wrong.

Here's the libxml2 test program that WORKS:

===test1.cpp============= BEGIN ========================================================
#include <libxml/parser.h>
#include <libxml/tree.h>
#include <cstdio>
#include <string>
#include <iostream>

using namespace std;

void parse_file_test(const std::string& filename);

void parse_file_test(const std::string& filename)
{
  xmlParserCtxtPtr ctxt;
  xmlDocPtr doc;

  ctxt = xmlNewParserCtxt();
  if (ctxt == NULL)
  {
    std::cerr << "Failed to allocate parser context" << std::endl;
    return;
  }

  // Validate against the DTD and substitute entities while parsing.
  doc = xmlCtxtReadFile(ctxt, filename.c_str(), NULL,
                        XML_PARSE_DTDVALID | XML_PARSE_NOENT);
  if (doc == NULL)
  {
    std::cerr << "Failed to parse " << filename << std::endl;
  }
  else
  {
    if (ctxt->valid == 0)
    {
      std::cerr << "Failed to validate " << filename << std::endl;
    }
    else
    {
      // Dump the parsed document so the substituted entities are visible.
      FILE* f = NULL;
      f = fopen("res.xml", "w");
      if (f)
      {
        xmlDocDump(f, doc);
        fclose(f);
      }
    }
    xmlFreeDoc(doc);
  }
  xmlFreeParserCtxt(ctxt);
}

int main(int _argc, char** _argv)
{
  if (_argc < 2)
  {
    cerr << "File name missing" << endl;
    return 1;
  }
  string filename(_argv[1]);
  parse_file_test(filename);
  return 0;
}
===test1.cpp============= END ========================================================

I then take the function parse_file_test(), copy-paste it into parsers/domparser.cc and declare it in parsers/domparser.h outside the xmlpp namespace like this:

...
#include <libxml++/api_export.h>

extern void parse_file_test(const std::string& filename);

namespace xmlpp {
...

And here's the test program I use to test parse_file_test() in libxml++:

===test2.cpp============= BEGIN ========================================================
#include <libxml++/libxml++.h>
#include <iostream>

int main(int argc, char* argv[])
{
  std::string filepath;
  if(argc > 1 )
    filepath = argv[1]; //Allow the user to specify a different XML file to parse.
  else
    filepath = "example.xml";

  try
  {
    parse_file_test(filepath);
  }
  catch(const std::exception& ex)
  {
    std::cout << "Exception caught: " << ex.what() << std::endl;
  }
  return 0;
}
===test2.cpp============= END ========================================================

And here are the XML files I use as input:

=====example.xml==============BEGIN=========================================
<?xml version="1.0"?>
<!DOCTYPE example PUBLIC "" "example.dtd" [
<!ENTITY wwwmurrayc SYSTEM "entity.xml">
<!ENTITY wwwlibxmlplusplus "http://libxmlplusplus.sourceforge.net">
]>
<example>
  <examplechild id="1">
    <child_of_child> &wwwmurrayc; </child_of_child>
  </examplechild>
</example>
======example.xml=============END===========================================

======entity.xml============BEGIN===========================================
Hello!
======entity.xml============END=============================================

test1.cpp runs just fine - it produces an XML file res.xml as follows:

<?xml version="1.0"?>
<!DOCTYPE example PUBLIC "" "example.dtd" [
<!ENTITY wwwmurrayc SYSTEM "entity.xml">
<!ENTITY wwwlibxmlplusplus "http://libxmlplusplus.sourceforge.net">
]>
<example>
  <examplechild id="1">
    <child_of_child> Hello! </child_of_child>
  </examplechild>
</example>

Alas, test2.cpp crashes. When I remove xmlFreeDoc(doc) from parse_file_test() in libxml++, it also works just fine. Otherwise it crashes:

(gdb) bt
#0 0x00000011 in ?? ()
#1 0x0806d184 in xmlFreeNodeList (cur=0x812b070) at tree.c:3282
#2 0x0806d057 in xmlFreeNodeList (cur=0x812a4a8) at tree.c:3286
#3 0x0806d057 in xmlFreeNodeList (cur=0x8129f38) at tree.c:3286
#4 0x0806b05d in xmlFreeDoc (cur=0x8117348) at tree.c:1126
#5 0x0804c9ae in parse_file_test(std::string const&) (filename=@0xbffff7b0) at domparser.cc:277
#6 0x0804b52f in main ()
#7 0x420158d4 in __libc_start_main () from /lib/i686/libc.so.6

N.B. domparser.cc:277 is where xmlFreeDoc(doc) of parse_file_test() sits in my case.

So what's happening? Am I doing something wrong?

--
Best regards,
Igor mailto:ism...@st...
From: Christophe de V. <cde...@al...> - 2004-10-19 08:28:24
|
Hi,

I could reproduce the problem by just setting substitute_entities to true.
From what I can see, the problem does not come from how you use the library.

libxml++ uses the _private field of xmlNode to store pointers to children of xmlpp::Node.
When entity substitution is activated, libxml2 seems not to keep the _private value consistent enough for libxml++, leading to objects being deleted twice (which causes the segmentation fault you had).

I'll try to produce a small pure libxml2 sample which shows the problem, so we can discuss (and hopefully solve) it on the libxml mailing-list.

Regards,

Christophe
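[Editor's sketch] The double deletion described above can be illustrated with a small self-contained model. Everything in it is hypothetical: CNode, Wrapper and WrapperRegistry are stand-ins rather than libxml2 or libxml++ types, and the trigger used here (a node duplicated by plain copy, so that two nodes carry the same _private pointer) is only an assumed way for _private to become inconsistent. The thread does not establish what libxml2 actually does during entity substitution.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-ins: these are NOT the real libxml2 / libxml++ types.
struct CNode {
    void* _private = nullptr;  // slot a C library leaves for its bindings
};

struct Wrapper {
    bool destroyed = false;    // set instead of really deleting (see on_free)
};

class WrapperRegistry {
public:
    // "Node created" callback: attach a fresh wrapper to the node.
    void on_create(CNode& node) {
        assert(node._private == nullptr);
        storage_.push_back(std::unique_ptr<Wrapper>(new Wrapper()));
        node._private = storage_.back().get();
    }

    // "Node freed" callback: destroy the wrapper that _private points to.
    // A real binding would `delete` here; the sketch only flags the wrapper
    // so that a second attempt can be counted without undefined behaviour.
    void on_free(CNode& node) {
        Wrapper* w = static_cast<Wrapper*>(node._private);
        if (w == nullptr) return;
        if (w->destroyed) { ++double_frees_; return; }
        w->destroyed = true;
    }

    std::size_t double_frees() const { return double_frees_; }

private:
    std::vector<std::unique_ptr<Wrapper>> storage_;  // keeps wrappers alive
    std::size_t double_frees_ = 0;
};

// Frees one node, optionally after duplicating it by plain struct copy.
// The copy inherits _private verbatim and never goes through on_create().
std::size_t count_double_frees(bool duplicate_node) {
    WrapperRegistry registry;
    CNode original;
    registry.on_create(original);
    if (duplicate_node) {
        CNode copy = original;
        registry.on_free(copy);   // first (legitimate) destruction
    }
    registry.on_free(original);   // second destruction if a copy existed
    return registry.double_frees();
}
```

Calling count_double_frees(false) reports no double destruction, while count_double_frees(true) reports one. In a binding that really calls delete in the free callback, that second destruction would be the kind of crash seen inside xmlFreeDoc.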
From: Daniel V. <vei...@re...> - 2004-10-19 08:41:20
|
On Tue, Oct 19, 2004 at 10:28:13AM +0200, Christophe de VIENNE wrote:
> I'll try to produce a small pure libxml2 sample which shows the problem,
> so we can discuss (and hopefully solve) it on the libxml mailing-list.

Yes, please use bugzilla, preferably after checking with the current CVS version. I may try to do a release at the end of the week.

Daniel

--
Daniel Veillard | Red Hat Desktop team http://redhat.com/
vei...@re... | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
From: Christophe de V. <cde...@al...> - 2004-10-19 12:08:40
|
Daniel Veillard wrote:

>On Tue, Oct 19, 2004 at 10:28:13AM +0200, Christophe de VIENNE wrote:
>
>>I'll try to produce a small pure libxml2 sample which shows the problem,
>>so we can discuss (and hopefully solve) it on the libxml mailing-list.
>
> Yes, please use bugzilla, preferably after checking with the current
>CVS version.

Done. I attached a small example that shows the problem (although I don't think its output will help much in solving it).

Christophe
From: Igor L. S. <ism...@st...> - 2004-10-19 09:13:03
|
Hello Christophe,

Tuesday, October 19, 2004, 12:28:13 PM, you wrote:

CdV> Hi,
CdV> I could reproduce the problem by just setting substitute_entities to true.
CdV> From what I can see, the problem does not come from how you use the
CdV> library.

Yes, indeed this problem can be triggered simply by setting substitute_entities to true in ../examples/dom_entities/main.cc.

However, what I actually did was add to libxml++ a function which loads an XML document with entity substitution using libxml2 only, and call this function from my own program without actually using any libxml++ methods. I did not create any libxml++ objects at all. That very same function, when placed into a file of its own or in the same file as main(), works just fine.

CdV> libxml++ uses the _private field of xmlNode to store pointers to children
CdV> of xmlpp::Node.
CdV> When entity substitution is activated, libxml2 seems not to keep the
CdV> _private value consistent enough for libxml++, leading to objects
CdV> being deleted twice (which causes the segmentation fault you had).

Thing is I tried to load a doc by means of pure libxml2 without any libxml++ involvement - what I did was simply place that function inside libxml++. Just a function, not even a method of some libxml++ object or anything.

The reason I did so is this - I had come across that problem with setting substitute_entities to true in libxml++ and started to look for a quick solution - it soon turned out that pure libxml2 did the trick just fine, but since I was using libxml++ in my project I thought I'd integrate it with libxml++. My attempts at integration failed, and since I was at the end of my tether I thought I'd just do a small check - add a piece of code (which I knew worked fine on its own) to the library without really plugging it into its internals in any way. That code doesn't rely on anything inside libxml++; you may call it a 'foreign object' if you like.

Naturally one would think that in a situation like that there'd be no problems of this kind, and yet there are.

--
Best regards,
Igor mailto:ism...@st...
From: Christophe de V. <cde...@al...> - 2004-10-19 09:30:08
|
Hi Igor,

Igor L. Smolovski wrote:

>CdV> libxml++ uses the _private field of xmlNode to store pointers to children
>CdV> of xmlpp::Node.
>CdV> When entity substitution is activated, libxml2 seems not to keep the
>CdV> _private value consistent enough for libxml++, leading to objects
>CdV> being deleted twice (which causes the segmentation fault you had).
>Thing is I tried to load a doc by means of pure libxml2 without any
>libxml++ involvement - what I did was simply place that function inside libxml++.
>Just a function, not even a method of some libxml++ object or anything.

The thing is that libxml++ registers itself with libxml2, so it is notified at each node creation/destruction and automatically instantiates or destroys a libxml++ object.

Regards,

Christophe
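[Editor's sketch] The notification mechanism described above can be modelled without either library. The clib and binding namespaces below are hypothetical miniatures and do not reproduce any real API. libxml2 does provide process-wide hooks of this kind (xmlRegisterNodeDefault and xmlDeregisterNodeDefault), but that libxml++ 1.0.4 uses exactly those two is an assumption here. The point of the sketch is that code calling only the "C library" still triggers the binding's callbacks once they are installed.

```cpp
#include <cassert>

// Hypothetical miniature "C library"; names are illustrative only.
struct CNode {
    void* _private = nullptr;
};
using NodeHook = void (*)(CNode*);

namespace clib {
NodeHook on_register = nullptr;    // process-wide hook, run on node creation
NodeHook on_deregister = nullptr;  // process-wide hook, run on node destruction

CNode* new_node() {
    CNode* node = new CNode;
    if (on_register) on_register(node);
    return node;
}

void free_node(CNode* node) {
    assert(node != nullptr);
    if (on_deregister) on_deregister(node);
    delete node;
}
}  // namespace clib

// Hypothetical C++ binding that mirrors every C node with a wrapper object.
namespace binding {
struct Wrapper {
    CNode* node;
};
int wrappers_alive = 0;

void create_wrapper(CNode* node) {
    node->_private = new Wrapper{node};
    ++wrappers_alive;
}

void destroy_wrapper(CNode* node) {
    delete static_cast<Wrapper*>(node->_private);
    node->_private = nullptr;
    --wrappers_alive;
}

// Installing the hooks affects every later user of clib in the process.
void install() {
    clib::on_register = create_wrapper;
    clib::on_deregister = destroy_wrapper;
}
}  // namespace binding

// Code written purely against the "C library": it never mentions the binding,
// yet it reports how many wrappers existed while its node was alive.
int pure_c_usage() {
    CNode* node = clib::new_node();
    int wrappers_seen = binding::wrappers_alive;
    clib::free_node(node);
    return wrappers_seen;
}
```

Before binding::install() runs, pure_c_usage() sees no wrappers; afterwards it sees one, even though the function itself is unchanged. This matches the situation in the thread, where a pure libxml2 function behaves differently once libxml++ is part of the program.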
From: Igor L. S. <ism...@st...> - 2004-10-19 10:57:10
|
Hello Christophe,

Tuesday, October 19, 2004, 1:29:55 PM, you wrote:

CdV> The thing is that libxml++ registers itself with libxml2, so it is
CdV> notified at each node creation/destruction and automatically
CdV> instantiates or destroys a libxml++ object.

Thing is I did not instantiate any libxml++ object in my program. The only umbilical cord that connects me to libxml++ in my program is the following line (apart from the actual call to my function):

#include <libxml++/libxml++.h>

Am I correct in saying that somewhere in the header files included via libxml++.h a libxml++ object, which registers itself with libxml2, is actually instantiated?
(If yes, could you possibly direct me to that header file and object?)
I've always thought that libxml++ becomes directly involved only when I actually instantiate an object in my program, for instance xmlpp::DomParser, etc.

--
Best regards,
Igor mailto:ism...@st...
From: Igor L. S. <ism...@st...> - 2004-10-19 09:19:10
|
Hello Christophe,

Tuesday, October 19, 2004, 12:28:13 PM, you wrote:

ILS> Yes, indeed this problem can be triggered simply by setting
ILS> substitute_entities to true in ../examples/dom_entities/main.cc.

Sorry, I meant ../examples/dom_parse_entities/main.cc.

--
Best regards,
Igor mailto:ism...@st...
From: Christophe de V. <cde...@al...> - 2004-10-19 14:51:28
|
Hi Igor,

Igor L. Smolovski wrote:

>Thing is I did not instantiate any libxml++ object in my program. The
>only umbilical cord that connects me to libxml++ in my program is the
>following line (apart from the actual call to my function):
>
>#include <libxml++/libxml++.h>

You also link to libxml++.so, which statically instantiates an object of type Document::Init (cf. document.(h|cc)). Its constructor registers the callbacks.

BTW, I realised that the problems we encounter with static instantiation and compilers other than gcc can be solved using the nifty counter idiom. I'm tempted to do it for the next stable release, although I'll have to make sure it does not break the ABI.

>Am I correct in saying that somewhere in the header files included via
>libxml++.h a libxml++ object, which registers itself with libxml2, is actually instantiated?
>(If yes, could you possibly direct me to that header file and object?)
>I've always thought that libxml++ becomes directly involved only when I actually instantiate an object
>in my program, for instance xmlpp::DomParser, etc.

Just linking to libxml++ is enough.
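[Editor's sketch] For readers unfamiliar with the nifty counter idiom mentioned above, here is a minimal single-file model of it. The demo namespace, install_hooks(), uninstall_hooks() and the layout of Init are all hypothetical; this is not the code in libxml++'s document.(h|cc), only an illustration of how a static object can register callbacks exactly once, before any user code creates an object.

```cpp
#include <cassert>

namespace demo {
// Hypothetical state standing in for callbacks registered with a C library.
int registrations = 0;         // how many times the hooks were installed
bool hooks_installed = false;

void install_hooks()   { hooks_installed = true; ++registrations; }
void uninstall_hooks() { hooks_installed = false; }

// Nifty (Schwarz) counter. In a real library the class and one static
// instance would live in a header, so every translation unit that includes
// it contributes its own Init object; only the first one does the work.
class Init {
public:
    Init() {
        if (count_++ == 0) install_hooks();
    }
    ~Init() {
        assert(count_ > 0);
        if (--count_ == 0) uninstall_hooks();
    }
    Init(const Init&) = delete;
    Init& operator=(const Init&) = delete;

private:
    static int count_;
};

// Constant-initialised, so it is already 0 before any Init constructor runs.
int Init::count_ = 0;

// Two objects stand in for two translation units that include the header.
static Init init_for_unit_a;
static Init init_for_unit_b;
}  // namespace demo
```

By the time main() starts, demo::hooks_installed is true and demo::registrations is 1, although no user code has created anything. The last Init object to be destroyed at program exit removes the hooks again.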
From: Igor S. <smo...@fa...> - 2004-10-20 07:07:13
|
Hello Christophe,

Tuesday, October 19, 2004, 6:51:15 PM, you wrote:

CdV> Hi Igor,

CdV> Igor L. Smolovski wrote:

>>Thing is I did not instantiate any libxml++ object in my program. The
>>only umbilical cord that connects me to libxml++ in my program is the
>>following line (apart from the actual call to my function):
>>
>>#include <libxml++/libxml++.h>

CdV> You also link to libxml++.so, which statically instantiates an object of
CdV> type Document::Init (cf. document.(h|cc)). Its constructor
CdV> registers the callbacks.

Thanks a lot. I see it clearly now :-)

--
Best regards,
Igor mailto:smo...@fa...