Re: [CZT-Devel] Memory allocation with ArrayLists; arrays; etc.


Yes, the big integers will get large. But in your previous e-mail you said
"given most of them are shared". My objection is that I don't think they
are actually shared.

The -16..16 range comes from the BigInteger.valueOf() method - check the Java
sources. My point was that BigInteger objects are shared only in this range.
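
To make the [-16, 16] point concrete, here is a minimal sketch (the class and
method names are made up for illustration). It relies on how the OpenJDK
sources implement BigInteger.valueOf(), which keeps preallocated constants for
that range; the Javadoc does not promise this, so other runtimes may behave
differently.

```java
import java.math.BigInteger;

// Illustrative only: checks whether BigInteger.valueOf() hands back the
// same object twice for a given value (an OpenJDK implementation detail).
class BigIntegerSharing {
    // True when two valueOf() calls for the same value return one shared object.
    static boolean isShared(long value) {
        return BigInteger.valueOf(value) == BigInteger.valueOf(value);
    }

    public static void main(String[] args) {
        long[] samples = {-17, -16, 0, 16, 17, 1000};
        for (long v : samples) {
            System.out.println(v + " shared: " + isShared(v));
        }
    }
}
```

Outside that range each valueOf() call allocates a fresh object, so line,
column and length values above 16 are shared only if the caller caches them
itself.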

So I just wanted to say that BigIntegers do constitute a significant part of
CZT memory usage :)

Andrius

On Fri, Apr 12, 2013 at 4:28 PM, Leo Freitas <leo...@ne...> wrote:

>  Hi Andrius,
>
>  I don't think so. The big integers are used for the line / column /
> buffer length numbers, no?
> So if you have large files they will get large. Or am I mistaken? Where
> did the -16..16 come from?
>
>  Leo
>
>  On 12 Apr 2013, at 09:12, Andrius Velykis <
> and...@ne...> wrote:
>
>  Hi Leo,
>
>  (reviving an old post...)
>
>   As is, the only major problem with memory consumption is the
>> duplication (e.g., ArrayList/Object[]).
>> The presence of various (4-6) BigIntegers in LocAnn is also an issue, but
>> given most of them are shared
>> it's not such a big deal.
>>
>
>  From what I gather, BigIntegers are only shared when in the range of
> [-16, 16] -- looking at BigInteger.valueOf()? Or am I mistaken? If so, most
> of the BigIntegers would not be shared in CZT..
>
>  Andrius
>
>
> On Wed, Jan 25, 2012 at 6:33 AM, Leo Freitas <leo...@ne...> wrote:
>
>> Hi Tim,
>>
>> Yes, after parsing. And yes, I found that rather odd too. There is
>> something funny going on
>> in Garbage Collection somewhere. I guess the trouble is that there are
>> certain structures
>> (e.g., TokenSequence iterators) that end up retaining the biggest part of
>> the object memory.
>>
>> Having said that, I was now trying to find the sources of the problem(s)
>> using more intrusive
>> snapshot points (AKA memory dumps at particular execution points).
>>
>> As is, the only major problem with memory consumption is the duplication
>> (e.g., ArrayList/Object[]).
>> The presence of various (4-6) BigIntegers in LocAnn is also an issue, but
>> given most of them are shared
>> it's not such a big deal.
>>
>> I've tweaked the PerformanceSettings constant(s) a little bit more to
>> keep arraylists/object[] capacity 0 at creation,
>> and that just about halved the memory used (!)... possibly at some speed
>> expense given it will take some time to
>> increase capacity.
>>
>> In the profiling sessions done, this wasn't a problem: I've put initial
>> capacity at 0, 1, and 10 (only very few lists go
>> beyond that, mainly on type info), and the CPU difference was marginal
>> (e.g., with 10 it was 300ms - 2.75% - quicker; with 1, 100ms quicker),
>> yet the memory gain was significant (e.g., with 10 and 1 the heap was
>> twice as large, at 100MB instead of 50MB for 0).
>>
>> One nice thing would be to have this as a configuration file or a
>> SectionManager option.... once all key bottleneck points are identified.
>> By default I will keep it at 0, as the performance penalty in CPU time
>> seems marginal/acceptable (e.g., 2-3% penalty for halving memory needs).
>>
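
A minimal sketch of the configuration idea above, assuming a system property
is an acceptable stand-in for a configuration file or SectionManager option.
The class name ListCapacityConfig and the property name are hypothetical and
not part of CZT; the default of 0 follows the choice described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CZT: one place that decides the
// initial capacity used when creating (often empty) lists.
class ListCapacityConfig {
    // Hypothetical property name, chosen for this example only.
    static final String PROPERTY = "example.list.initialCapacity";

    // Reads the configured capacity; falls back to 0 when the property
    // is absent, malformed, or negative.
    static int initialCapacity() {
        String raw = System.getProperty(PROPERTY);
        if (raw == null) {
            return 0;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value < 0 ? 0 : value;
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    // Creates a list with the configured capacity instead of the
    // no-argument ArrayList constructor.
    static <T> List<T> newList() {
        return new ArrayList<T>(initialCapacity());
    }

    public static void main(String[] args) {
        List<String> annotations = newList();
        annotations.add("loc");
        System.out.println(annotations.size());
    }
}
```

Running with -Dexample.list.initialCapacity=10 would restore the larger
starting capacity for a speed comparison without touching the generated code.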
>> Suggestions / comments?
>>
>> Best,
>> Leo
>>
>> On 24 Jan 2012, at 23:11, Tim Miller wrote:
>>
>> > Nice work!
>> >
>> > When you say you added "parser = null", I assume you mean after you do
>> > the parsing? This would mean that, prior to your change, a reference to
>> > the parser was being kept even after the scope of the variable
>> > "parser" ends? That seems odd.
>> >
>> > On 25/01/12 01:22, Leo Freitas wrote:
>> >> Tim,
>> >>
>> >> I forgot to mention below that I was talking about parsing and typechecking only.
>> >>
>> >> I've just run the same profiling setup on the VCG for these larger examples and yes, the worst ones were
>> >> Object[] (24%), char[] (17.3%), ArrayList[] (9%) and BigInteger (2-3%).
>> >>
>> >> Surprising ones were int[] (6.5%) and short[] (6%) arrays, since they are not explicitly created anywhere in CZT.
>> >> LocalAnn took 3%; Iterators were also a surprise: both iterator() and listIterator() cover about 10%, whereas more
>> >> obvious ones like String (0.9%) and ZName (1.7%) were quite low.
>> >>
>> >> I've run the profiling sessions about 3-5 times on each example, taking the average.
>> >>
>> >> ----
>> >>
>> >> Searching for potential sources of such unexpected types, I managed to find a few successful candidates:
>> >>
>> >> a) ParseUtils
>> >>
>> >> char[], short[], and int[] are mostly present in the Java CUP low-level classes to represent the internal parsing tables from the grammar.
>> >> I simply added an explicit "parser = null" to the main ParseUtils.parse method. These arrays combined account for about 30% of memory.
>> >>
>> >> After the change, these arrays now account for 1.1% (!!!); that's quite an improvement.
>> >>
>> >> b) ListIterator and Iterator
>> >>
>> >> These appear in various places, but places that should influence all runs are only on SmartScanner (and possibly PrettyPrinter).
>> >> After this change, they went from a combined (ListIterator + Iterator) 49% footprint to 17%; another good improvement.
>> >>
>> >> d) ArrayList and Object[]
>> >>
>> >> There are various places, and identifying which ones are the most significant is tedious/time-consuming.
>> >> Instead, I've done a thorough search through various projects and changed default constructors to more sensible values, when possible,
>> >> or to a parameterised default when not (e.g., PerformanceSettings interface in util project).
>> >>
>> >> This led to a decrease in the memory footprint of these objects of....
>> >>
>> >> e) BigInteger
>> >>
>> >> Why do we need big integers within LocAnn and other places? I guess because long would be potentially too small?
>> >> But would there really be something bigger than 2^32 or 2^64 in number of lines of source in a file, say? Hum...
>> >>
>> >> In terms of memory footprint, it varied from 0.7% to 3%, depending on the example. I won't change this one for now.
>> >>
>> >> f) Garbage collection time
>> >>
>> >> At first it was about 16-20%; now it is about 46.1% of the time taken. Don't know if that's related to the changes, but it looks like it.
>> >>
>> >> ======
>> >>
>> >> Changes a) and b) led to a drastic memory footprint decrease from 8MB to 0.91MB;
>> >> Other changes led to a minor improvement in comparison (i.e., 0.80MB).
>> >>
>> >> That's despite the fact that changes in d) were the most numerous - they didn't amount to much improvement, but some.
>> >>
>> >> Hopefully this will enable much better performance on larger specs. I am committing it now to see how it goes.
>> >>
>> >> Best,
>> >> Leo
>> >>
>> >> On 23 Jan 2012, at 14:00, Leo Freitas wrote:
>> >>
>> >>> Hi Tim,
>> >>>
>> >>> That's interesting. I've been doing similar (profiling) tests over specs of some size (e.g., Mondex, Tokeneer, Xenon, IEEE floating point unit);
>> >>> although I guess they are smaller than iFACTS - apart from Xenon, which is quite large.
>> >>>
>> >>> In the profiling sessions, the worst culprit was "char[]" arrays, mostly from the java_cup lexer. The Object[] and ArrayList[] were about 2-3% each,
>> >>> whereas the char[] was 90% (!)... Similarly, on profiling CPU, it was the IO operations on zzRefill within the java_cup scanner that took the largest chunk
>> >>> of the time (27%). Smart scanning (e.g., lookahead) has taken only about 2-3%.
>> >>>
>> >>> I wonder... what was the profiling setup that you used to get to the creation of Object[] / ArrayList as the main problem?
>> >>> Although the change you refer to below shouldn't be relatively simple to change, as Petra pointed out.
>> >>>
>> >>> I just want to get the right picture to tackle such performance problems for larger specs.
>> >>>
>> >>> Best,
>> >>> Leo
>> >>>
>> >>> On 5 Jan 2012, at 23:51, Tim Miller wrote:
>> >>>
>> >>>> Hi everyone,
>> >>>>
>> >>>> Anthony Hall and I have been discussing some memory problems that CZT
>> >>>> has when parsing large specifications. Anthony has been trying to
>> >>>> typecheck the iFacts specification, but without much luck due to the
>> >>>> large memory resources required.
>> >>>>
>> >>>> We've each been playing around with VisualVM, and Anthony pointed out
>> >>>> that the two largest memory hogs are Object[] and ArrayList, taking up
>> >>>> around 23% and 10% of the heap respectively.
>> >>>> Most of the object arrays and ArrayLists contain exactly 10 items, and
>> >>>> almost all items in these lists are null.
>> >>>>
>> >>>> After some poking around, I discovered that when ArrayList is created
>> >>>> using the default constructor, it allocates 10 items initially. This
>> >>>> appears to be where the 10 items come from in each case. I suspect this
>> >>>> may also be contributing to some of the memory problems, considering
>> >>>> that CZT has so many empty annotation lists, etc. Creating ArrayLists
>> >>>> with an initial capacity of 0 or 1 (using the constructor
>> >>>> ArrayList(int initialCapacity)) may give us some substantial space savings.
>> >>>>
>> >>>> It appears that most ArrayLists are created in the gnast-generated code.
>> >>>> Petra, how difficult would it be to create these lists using this other
>> >>>> constructor?
>> >>>>
>> >>>> Regards,
>> >>>> Tim
>> >>>>
>> >>>>
>> >>>> _______________________________________________
>> >>>> CZT-Devel mailing list
>> >>>> CZT...@li...
>> >>>> https://lists.sourceforge.net/lists/listinfo/czt-devel
>> >>>
>> >>>
>> >>>
>> >>
>> >>
>> >
>> >
>> >
>>
>>
>>
>>
>
>
>
