Hey all, I'm wondering if anyone can shed some light on some strange results I'm getting while using VTD-XML on iOS. I was initially very excited by VTD's performance claims, but in my early tests I'm finding that the libxml-based library GDataXML (from Google) is almost twice as fast. Perhaps I'm just not doing something correctly. Anyway, here is what I'm doing:
I am using Objective-C for the GDataXML library and Objective-C++ for VTD-XML.
1. For both libraries, I load a 3MB XML file from the file system into a byte array.
2. I record a starting time.
3. For VTD, I call VTDGen.setDoc with the bytes, then I call VTDGen.parse. For GDataXML, I call its corresponding functions to load and parse the XML.
4. I record a stopping time, calculate the difference and print it out to the screen. Here's what I see:
GDataXML Parsing Time: 0.044 seconds
VTD-XML Parsing Time: 0.091 seconds
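For reference, steps 2 to 4 above can be sketched in plain C as follows. This is only a minimal sketch, not the code that produced the numbers above: now_seconds, best_of, and busy_work are hypothetical names, busy_work is a stand-in for the real parse call, and clock_gettime is assumed to be available (on older iOS versions a timer such as mach_absolute_time would be needed instead). Taking the fastest of several runs is one common way to reduce noise from caching and scheduling in timings this short.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Monotonic clock reading in seconds (assumes POSIX clock_gettime). */
static double now_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Calls fn(arg) `runs` times and returns the fastest run in seconds. */
static double best_of(int runs, void (*fn)(void *), void *arg) {
    assert(runs > 0);
    double best = -1.0;
    for (int i = 0; i < runs; i++) {
        double start = now_seconds();
        fn(arg);
        double elapsed = now_seconds() - start;
        if (best < 0.0 || elapsed < best) {
            best = elapsed;
        }
    }
    return best;
}

/* Hypothetical stand-in for the call being measured (e.g. a parse step). */
static void busy_work(void *arg) {
    volatile unsigned long *sum = arg;
    for (unsigned long i = 0; i < 1000000UL; i++) {
        *sum += i;
    }
}
```

Calling best_of(10, busy_work, &sum) with an unsigned long sum returns the fastest of ten runs; swapping busy_work for a wrapper around each library's parse call would give both libraries the same treatment.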
Now my first thought was that maybe VTD-XML was taking longer to parse because it had to create its index, and that it would really start to show its speed once I ran XPath queries. But I'm seeing even worse performance here. Again, here is my process:
1. For GDataXML, I run an XPath query that returns around 2,400 nodes.
2. For VTD-XML, I run the same XPath query and iterate through it. To rule out any memory copy performance hits, I'm just looping through the results and not storing them anywhere. Here is the code I'm using:
while ((index = ap.evalXPath()) != -1) {
    int x = 1;  // placeholder body; the results are not stored anywhere
}
Here is what I'm seeing:
GDataXML XPath Query: 0.003 seconds
VTD-XML XPath Query: 0.037 seconds
As you can see, it's quite a bit slower. Any ideas as to why I'm seeing such a drastic difference? I've tried to make my tests as similar and simple as possible to rule out any performance hits that might come from memory copies or things like that. I have a few possible theories:
1. VTD-XML's performance claims are just plain bogus (I don't think this is true, but I'll throw it out there as one explanation). Or perhaps I just don't understand VTD-XML entirely and the performance gains are only seen when you parse a VTD-indexed XML file (although this wouldn't explain the XPath performance).
2. The Apple Xcode compiler isn't as good at optimizing C++ code as it is at optimizing Objective-C. I'm using the new Apple LLVM 3.0 compiler.
3. I'm not using the VTD-XML library in the most efficient way. I don't see how I could make the parsing step any more efficient (it's just 2 function calls, and I've tried the Buffer Reuse version with no noticeable improvement). But perhaps there is a more efficient way to do the XPath execution and iteration?
Can anyone shed any light on these theories or offer any new ones? I'm really scratching my head here, because I feel like there must be something I'm missing to see such a drastic difference in performance. Any ideas would be greatly appreciated!
I have used VTD-XML extensively and have always been impressed with it. It does not sound to me like you are doing anything inefficient. I have never tried GDataXML, but you have made me want to try it. I will see if I can do that in the next day or so.
Hi, thanks for the post. I think that when you talk about performance it is always important to ask the following questions:
1. What is being measured, and how can it be measured correctly?
2. What is the best setting, and how should the code be tuned to get peak performance?
3. What are the XML files like in terms of size, structure, and other factors?
Since the library is written in C, function inlining plays a huge role, so the make process is very important.
Can you share some details on how you produced the benchmark results?
I also noticed that you used the C++ version of VTD-XML for the benchmark. I recommend using the C version instead.
So I've gone ahead and tried using the C library instead of the C++ one, but I've noticed a fairly major issue: iOS doesn't appear to support thread-local storage. I need this to work multi-threaded. I'm going to look at using Boost to fix this, but perhaps you know of a better way to handle multi-threading without the use of thread-local variables?
If you don't need threads, the simplest way is to strip out every __thread keyword from the source code. That will make the code compile.
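If threads are needed, one possible alternative to the __thread keyword is the POSIX thread-specific data API (pthread_key_create, pthread_getspecific, pthread_setspecific), which does not depend on compiler support for thread-local storage. The following is a minimal sketch under that assumption, not a patch for VTD-XML: thread_state, get_thread_state, and worker are hypothetical names, and the last_error field stands in for whatever the library actually keeps in its __thread variables.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread state; replaces a variable declared with __thread. */
typedef struct {
    int last_error;
} thread_state;

static pthread_key_t state_key;
static pthread_once_t state_once = PTHREAD_ONCE_INIT;

/* Creates the key exactly once; free() is run on each thread's value at thread exit. */
static void make_key(void) {
    int rc = pthread_key_create(&state_key, free);
    assert(rc == 0);
    (void)rc;
}

/* Returns the calling thread's own state, allocating it on first use. */
static thread_state *get_thread_state(void) {
    pthread_once(&state_once, make_key);
    thread_state *st = pthread_getspecific(state_key);
    if (st == NULL) {
        st = calloc(1, sizeof *st);
        if (st != NULL && pthread_setspecific(state_key, st) != 0) {
            free(st);
            st = NULL;
        }
    }
    return st;
}

/* Example worker: stores *arg in its own state, then reports back through *arg. */
static void *worker(void *arg) {
    int *value = arg;
    thread_state *st = get_thread_state();
    if (st != NULL) {
        st->last_error = *value;
        *value = st->last_error + 1;
    }
    return NULL;
}
```

Each place that read or wrote the __thread variable would instead go through get_thread_state(), at the cost of a function call per access.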