Martin Probst <mail@martin-probst.com> wrote:

> A stream-oriented XPath execution could work around this, though
> stream-based engines are significantly less performant as far as I know.

By stream-oriented, do you mean "SAX-like", i.e. without constructing a tree?

Which stream-oriented XPath engine would you recommend?
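Just to make sure we mean the same thing, here is a minimal sketch of the "SAX-like" idea as I understand it: match one simple, fixed path while streaming and never build a tree. This is not a real XPath engine, and the path /root/item/name and the class name are only examples I made up.

// Minimal sketch of the "SAX-like" idea: emit the text of every element
// matching the fixed path /root/item/name while streaming, without ever
// building a tree. Not a real XPath engine, just an illustration.
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class StreamingPathMatch {
    public static void main(String[] args) throws Exception {
        Deque<String> path = new ArrayDeque<>();
        StringBuilder text = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                path.push(qName);
                text.setLength(0);
            }
            @Override
            public void characters(char[] ch, int start, int len) {
                text.append(ch, start, len);
            }
            @Override
            public void endElement(String uri, String local, String qName) {
                // The stack reads from the current element outward, so the
                // path /root/item/name appears here as "name/item/root".
                if (String.join("/", path).equals("name/item/root")) {
                    System.out.println(text.toString().trim());
                }
                path.pop();
            }
        };

        SAXParserFactory.newInstance().newSAXParser()
                .parse(new java.io.File(args[0]), handler);
    }
}

Something like this keeps a roughly constant memory footprint, but of course it only handles trivial path expressions.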

But there is also one more thing: with Saxon people are able to use XQuery, so I am able to implement an XQueryService as well. What do you think about that?

BTW, the Xindice server uses 128 MB of memory and is unable to hold a 5 MB document!


> One major performance issue you might run into with your implementation
> strategy is the time needed to construct the documents back from the
> CLOBs in your database.

The document is constructed as a String. MySQL has a reputation for being a fast SQL server, so I/O should be fast, BUT THIS IS IMPORTANT:

my implementation CONSTRUCTS ITS TREE WITH EVERY QUERY, and that is what is wrong with MYX's performance, BUT:

1. If you parse a 5 MB file into a tree and get 80 MB of data (which seems to be the case when you use DOM), and then put those 80 MB on the hard disk, isn't that also a performance issue?!
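To be concrete about what "constructs its tree with every query" means, here is roughly the path a query takes in my current code. The table and column names (xml_resources, content) are made up for the example, and I use plain JAXP here instead of the Saxon classes:

// Rough sketch of the per-query path: read the stored XML back as a
// String (CLOB) and rebuild the whole tree before every single query.
// Table/column names are invented; the real code goes through Saxon.
import java.io.StringReader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PerQueryRebuild {
    public static NodeList query(Connection con, String id, String xpath) throws Exception {
        String xml;
        try (PreparedStatement ps =
                con.prepareStatement("SELECT content FROM xml_resources WHERE id = ?")) {
            ps.setString(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                xml = rs.getString(1);   // the whole document as one String
            }
        }
        // The expensive part: the full tree is rebuilt on every call.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
    }
}

Every call pays for the SELECT plus a full parse, which is exactly the cost you are pointing at.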

THERE ARE SEVERAL WAYS TO AVOID SO MUCH TREE CONSTRUCTION!

I can implement some kind of caching for Resources. When I make an instance of the MYX XPathQueryService and provide it with its xmldb Collection, I can use some amount of memory and load once, query many times!
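A minimal sketch of that "load once, query many times" idea, keeping the parsed tree per Resource id; loadXml(id) is only a placeholder for whatever really fetches the String from the Collection, it is not a real MYX or xmldb method:

// Keep the parsed tree for each Resource in memory, keyed by its id,
// so the parsing cost is paid only on the first query.
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ResourceCache {
    private final Map<String, Document> cache = new ConcurrentHashMap<>();

    public Document get(String id) throws Exception {
        Document cached = cache.get(id);
        if (cached != null) {
            return cached;               // already parsed: no tree construction
        }
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(loadXml(id))));
        cache.put(id, doc);              // pay the parsing cost only once
        return doc;
    }

    // Placeholder for the real lookup against the Collection / CLOB column.
    protected String loadXml(String id) {
        return "<resource id='" + id + "'/>";
    }
}

The obvious downside is the memory the cache keeps, which is the trade-off I mention above.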

> I don't know exactly about your implementation,
> but if you have to reparse and load the document often this will be
> really slow with Saxon (especially as it uses fixed-size arrays which it
> has to reallocate if they are getting over-full).

I did a small test:

I have 7 files totalling 22 MB.

I load the files into memory and run one XPath query using my implementation (MYX), directly from Eclipse 3.0.

The running time is 6.062 seconds, roughly 6 seconds.

The computer is an Intel P4 2.8 GHz with 2 GB RAM, running Windows 2000.

Also in memory: Eclipse 3.0, the MySQL server, VirusScan Enterprise 7.1, Outlook Express, 6 Internet Explorer windows, 2 command prompts, and 1 Notepad.

When I am not running my program I have about 1.3 GB of free memory and CPU usage of 0-2 percent. When I run my test, memory usage goes up by 140 MB (according to Windows Task Manager) and average CPU usage is about 40 percent.
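For reference, the test is roughly this shape. The directory is the one from the copy example below; the query "//*" and the use of plain JAXP instead of the MYX classes are placeholders:

// Roughly the shape of my test: load all the files into memory,
// run one XPath query over each, and time the whole thing.
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class QueryTimer {
    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();

        // load phase: parse every file into an in-memory tree
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        List<Document> docs = new ArrayList<>();
        for (File f : new File("c:/xml_examples").listFiles((d, n) -> n.endsWith(".xml"))) {
            docs.add(builder.parse(f));
        }

        // query phase: one XPath query per document
        XPath xpath = XPathFactory.newInstance().newXPath();
        int hits = 0;
        for (Document doc : docs) {
            hits += ((NodeList) xpath.evaluate("//*", doc, XPathConstants.NODESET)).getLength();
        }

        System.out.println(hits + " nodes in "
                + (System.currentTimeMillis() - start) + " ms");
    }
}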

If I just copy those files:

copy c:\xml_examples\*.xml c:\

it takes 0.44 seconds (under half a second) on my machine.

I will try to find some REAL benchmarks.


> You'll need a very good join implementation in the database to keep this
> fast. In other terms, this wouldn't really be a native approach, would
> it?

My approach is not that native; it is a "dummy" implementation at the moment.

 

