From: Panayotis K. <pan...@pa...> - 2010-09-18 11:54:42
On 18 Sep 2010, at 2:25 p.m., Sascha Haeberling wrote:

> I invested quite a lot of time to improve the internal structures of XMLVM, mainly the UniversalFile API, which now makes it very easy to handle the situation where resources are located either inside or outside the One-JAR.

That is exactly my point. I have seen in the source code that you have started to use UniversalFile, which handles this situation, and from my point of view that is a bad approach. Before all this effort grows too big and/or too important to abandon, I wanted to present these problems and explain why (for some time now) I have been against the One-JAR approach.

> You said that things grow big. That is natural and, as Joshua pointed out, this is not a problem nowadays. Of course, if the One-JAR became hundreds of MBs big, that would be too much. But right now it doesn't feel too big to download.

> One more note on size: Of course things will get added, but we will hopefully be able to significantly reduce its size soon. The OpenJDK libs included contain too much stuff we will never need. Once we have figured out what is "junk", we will cut it out and should be able to get rid of many MBs easily.

Still, the performance loss in terms of speed will always be there. And, as you said, there is a lot of developer effort going into supporting something that (in my view) is no longer a good idea. Space is only one issue. Even though space matters, I don't like bloated software just because we can build it. I prefer more elegant solutions, without sacrificing ease of use.

> The other argument you made is speed. Panayotis, you wrote a long e-mail (as you pointed out yourself), but you failed to provide data: what I need to know are hard facts. Provide data about what takes how long, why this is a problem, and how fast it would be in another case. THEN we have a foundation to talk about, and THEN we can maybe find other solutions as well.
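As an aside, the "inside or outside the One-JAR" lookup that UniversalFile handles can be sketched roughly as below. This is a hypothetical illustration, not the actual UniversalFile API; the class and method names are mine:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch: resolve a resource that may live either on the
// local filesystem (outside the jar) or on the classpath (packed into
// the One-JAR). Names are illustrative only.
public final class ResourceLocator {
    public static InputStream open(String name) throws IOException {
        // First preference: a real file on disk.
        Path onDisk = Paths.get(name);
        if (Files.isRegularFile(onDisk)) {
            return Files.newInputStream(onDisk);
        }
        // Fallback: look the resource up inside the running jar.
        InputStream inJar =
                ResourceLocator.class.getResourceAsStream("/" + name);
        if (inJar == null) {
            throw new IOException("Resource not found: " + name);
        }
        return inJar;
    }
}
```

The point of such an abstraction is that callers never care where the bytes come from, which is exactly why the question "should they be in the jar at all?" is worth settling before the abstraction spreads.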
> For example: I don't know how the One-JAR mechanism does its work exactly, but I agree with you that it seems silly to extract parts of the One-JAR that are not needed. E.g. when you just want to cross-compile a bunch of files, you don't need to extract certain libs every time. However, with ZIP files you can extract only certain parts, so it would make sense to extract only what is needed.

> Another way to solve this is to extract things on first run and cache them on disk, to improve speed when the tool is executed again.

> But again, for me what matters are numbers, not assumptions. If you can provide measurable numbers about what is slow, why it is bad, and how fast it could be, then we can talk about concrete steps to improve the situation.

I said in a previous post how to get the numbers you are asking for, but anyway, here they are. If you want numbers, numbers you will have:

    cd build/base/main ; time java -cp main.jar org.xmlvm.proc.NewMain
        0.24 real    0.25 user    0.04 sys

    cd build/bin ; time java org.xmlvm.proc.NewMain
        0.23 real    0.24 user    0.04 sys

    cd dist ; time java -jar xmlvm.jar
        2.16 real    2.37 user    0.25 sys

As you can see, the One-JAR is an order of magnitude slower. Are these enough hard facts for you? If you ran these scripts a hundred times per day, in production, you would share my feeling. What actually surprises me is that running from a plain jar and from class files on disk takes exactly the same time. It makes sense, of course, since the time we lose unzipping something in memory equals the time spent accessing separate files on disk.
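For what it's worth, the "extract on first run, cache on disk" idea Sascha suggests could look roughly like this. This is a sketch under my own assumptions, with hypothetical names, not actual XMLVM code: a jar entry is unpacked into a cache directory only when the cached copy is missing or older than the jar itself, so repeated runs skip the unzip cost entirely.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical sketch of first-run extraction with an on-disk cache.
public final class JarCache {
    public static Path extractOnce(Path jar, Path cacheDir, String entryName)
            throws IOException {
        Path target = cacheDir.resolve(entryName);
        // Reuse the cached copy if it is at least as new as the jar.
        if (Files.exists(target)
                && Files.getLastModifiedTime(target)
                        .compareTo(Files.getLastModifiedTime(jar)) >= 0) {
            return target;
        }
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                throw new IOException(entryName + " not found in " + jar);
            }
            Path parent = target.getParent();
            if (parent != null) {
                Files.createDirectories(parent);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
        }
        return target;
    }
}
```

Note that even this only amortizes the cost: the first run still pays the full extraction price, and you now have a stale-cache invalidation problem, which is part of why I would rather not bundle everything in the first place.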