From: Andrius V. <and...@ne...> - 2013-04-12 15:40:22
Yes, the big integers will get large. But in your previous e-mail you said "given most of them are shared", and that is what I wanted to object to: I don't think they are actually shared. The -16..16 range comes from the BigInteger.valueOf() method (check the Java sources); as far as I can see, that is the only range in which BigInteger objects are shared. So BigIntegers do constitute a significant part of CZT memory usage :)

Andrius

On Fri, Apr 12, 2013 at 4:28 PM, Leo Freitas <leo...@ne...> wrote:

> Hi Andrius,
>
> I don't think so. The big integers are used for the line / column / buffer length numbers, no?
> So if you have large files they will get large. Or am I mistaken? Where did the -16..16 come from?
>
> Leo
>
> On 12 Apr 2013, at 09:12, Andrius Velykis <and...@ne...> wrote:
>
> Hi Leo,
>
> (reviving an old post..)
>
>> As is, the only major problem with memory consumption is the duplication (e.g., ArrayList/Object[]).
>> The presence of various (4-6) BigIntegers in LocAnn is also an issue, but given most of them are shared it's not such a big deal.
>
> From what I gather, BigIntegers are only shared when in the range of [-16, 16] -- looking at BigInteger.valueOf()? Or am I mistaken? If so, most of the BigIntegers would not be shared in CZT..
>
> Andrius
>
>
> On Wed, Jan 25, 2012 at 6:33 AM, Leo Freitas <leo...@ne...> wrote:
>
>> Hi Tim,
>>
>> Yes, after parsing. And yes, I found that rather odd too. There is something funny going on in Garbage Collection somewhere. I guess the trouble is that there are certain structures (e.g., TokenSequence iterators) that end up retaining the biggest part of the object memory.
>>
>> Having said that, I was now trying to find the sources of the problem(s) using more intrusive snapshot points (AKA memory dumps at particular execution points).
>>
>> As is, the only major problem with memory consumption is the duplication (e.g., ArrayList/Object[]).
>> The presence of various (4-6) BigIntegers in LocAnn is also an issue, but given most of them are shared it's not such a big deal.
>>
>> I've tweaked the PerformanceSettings constant(s) a little bit more to keep arraylists/object[] capacity 0 at creation, and that just about halved the memory used (!)... possibly at some speed expense given it will take some time to increase capacity.
>>
>> At the profiling sessions done, this wasn't a problem: I've put initial capacity at 0, 1, and 10 (e.g., only very few go beyond that, mainly on type info), and the CPU increase was marginal (e.g., with 10 it was 300ms - 2.75% - quicker; with 1 100ms quicker), yet the memory gain was significant (e.g., with 10 and 1 it was twice the heap at 100MB instead of 50MB for 0).
>>
>> One nice thing would be to have this as a configuration file or a SectionManager option.... once all key bottleneck points are identified. By default I will keep it at 0, as the performance penalty in CPU time seems marginal/acceptable (e.g., 2-3% penalty for halving memory needs).
>>
>> Suggestions / comments?
>>
>> Best,
>> Leo
>>
>> On 24 Jan 2012, at 23:11, Tim Miller wrote:
>>
>> > Nice work!
>> >
>> > When you say you added "parser = null", I assume you mean after you do the parsing? This would mean that, prior to your change, a reference to the parser is being kept even after the scope of the variable "parser" ends? That seems odd.
>> >
>> > On 25/01/12 01:22, Leo Freitas wrote:
>> >> Tim,
>> >>
>> >> I forgot to mention below that I was talking about parsing and typechecking only.
>> >>
>> >> I've just run the same profiling setup on the VCG for these larger examples and yes, the worst ones were Object[] (24%), char[] (17.3%), ArrayList[] (9%) and BigInteger (2-3%).
>> >>
>> >> Surprising ones were int[] (6.5%) and short[] (6%) arrays, since they are not explicitly created anywhere in CZT.
>> >> LocAnn took 3%; Iterators were also a surprise: both iterator() and listIterator() cover about 10%, whereas more obvious ones like String (0.9%) and ZName (1.7%) were quite low.
>> >>
>> >> I've run the profiling sessions about 3-5 times on each example, taking the average.
>> >>
>> >> ----
>> >>
>> >> Searching for potential sources of such unexpected types, I managed to find a few successful candidates:
>> >>
>> >> a) ParseUtils
>> >>
>> >> char[], short[], and int[] are mostly present in the Java CUP low-level classes to represent the internal parsing tables from the grammar. I simply added an explicit "parser = null" to the main ParseUtils.parse method. These arrays combined account for about 30% of memory.
>> >>
>> >> After the change, these arrays now account for 1.1% (!!!); that's quite an improvement.
>> >>
>> >> b) ListIterator and Iterator
>> >>
>> >> These appear in various places, but the only places that should influence all runs are in SmartScanner (and possibly PrettyPrinter). After this change, they went from a combined (ListIterator + Iterator) 49% footprint to 17%; another good improvement.
>> >>
>> >> d) ArrayList and Object[]
>> >>
>> >> There are various places, and identifying the most significant ones is tedious/time-consuming. Instead, I've done a thorough search through various projects and changed default constructors to more sensible values, when possible, or to a parameterised default when not (e.g., PerformanceSettings interface in util project).
>> >>
>> >> This led to a decrease in the memory footprint of these objects of....
>> >>
>> >> e) BigInteger
>> >>
>> >> Why do we need big integers within LocAnn and other places? I guess because long would be potentially small? But would there really be something bigger than 2^32 or 2^64 in number of lines of source in a file, say? Hum...
>> >>
>> >> In terms of memory footprint, it varied between 0.7% and 3%, depending on the example. I won't change this one for now.
>> >>
>> >> f) Garbage collection time
>> >>
>> >> At first it was about 16-20%; now it is about 46.1% of the time taken. Don't know if that's related to the changes, but it looks like it.
>> >>
>> >> ======
>> >>
>> >> Changes a) and b) led to a drastic memory footprint decrease from 8MB to 0.91MB; other changes led to a minor improvement in comparison (i.e., 0.80MB).
>> >>
>> >> That's despite the fact that changes in d) were the most numerous - they didn't amount to much improvement, but some.
>> >>
>> >> Hopefully this will enable much better performance on larger specs. I am committing it now to see how it goes.
>> >>
>> >> Best,
>> >> Leo
>> >>
>> >> On 23 Jan 2012, at 14:00, Leo Freitas wrote:
>> >>
>> >>> Hi Tim,
>> >>>
>> >>> That's interesting. I've been doing similar (profiling) tests over specs of some size (e.g., Mondex, Tokeneer, Xenon, IEEE floating point unit); although I guess they are smaller than iFACTS - apart from Xenon, which is quite large.
>> >>>
>> >>> On the profiling sessions, the worst culprit was "char[]" arrays, mostly from the java_cup lexer. The Object[] and ArrayList[] were about 2-3% each, whereas char[] was 90% (!)... Similarly, on profiling CPU, it was the IO operations on zzRefill within the java_cup scanner that took the largest chunk of the time (27%). Smart scanning (e.g., lookahead) has taken only about 2-3%.
>> >>>
>> >>> I wonder... what was the profiling setup that you used to get to the creation of Object[] / ArrayList as the main problem?
>> >>> Although the change you refer to below shouldn't be relatively simple to change, as Petra pointed out.
>> >>>
>> >>> I just want to get the right picture to tackle such performance problems for larger specs.
>> >>>
>> >>> Best,
>> >>> Leo
>> >>>
>> >>> On 5 Jan 2012, at 23:51, Tim Miller wrote:
>> >>>
>> >>>> Hi everyone,
>> >>>>
>> >>>> Anthony Hall and I have been discussing some memory problems that CZT has when parsing large specifications. Anthony has been trying to typecheck the iFacts specification, but without much luck due to the large memory resources it requires.
>> >>>>
>> >>>> We've each been playing around with VisualVM, and Anthony pointed out that the two largest memory hogs are Object[] and ArrayList, taking up around 23% and 10% of the heap respectively. Most of the object arrays and ArrayLists contain exactly 10 items, and almost all items in these lists are null.
>> >>>>
>> >>>> After some poking around, I discovered that when ArrayList is created using the default constructor, it allocates space for 10 items initially. This appears to be where the 10 items come from in each case. I suspect this may also be contributing to some of the memory problems, considering that CZT has so many empty annotation lists, etc. Creating ArrayLists with an initial capacity of 0 or 1 (using the constructor ArrayList(int initialCapacity)) may give us some substantial space savings.
>> >>>>
>> >>>> It appears that most ArrayLists are created in the gnast-generated code. Petra, how difficult would it be to create these lists using this other constructor?
>> >>>>
>> >>>> Regards,
>> >>>> Tim
>> >>>>
>> >>>> _______________________________________________
>> >>>> CZT-Devel mailing list
>> >>>> CZT...@li...
>> >>>> https://lists.sourceforge.net/lists/listinfo/czt-devel
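
The sharing question raised at the top of this thread can be checked directly. The following is only a minimal sketch, assuming an OpenJDK-style BigInteger.valueOf() that caches the constants in [-16, 16]; that cache is an implementation detail visible in the JDK sources, not something the API documentation promises. The class name BigIntegerSharingDemo and the helper isShared are made up for illustration and are not part of CZT.

```java
import java.math.BigInteger;

class BigIntegerSharingDemo {
    // Hypothetical helper: true if two valueOf() calls for the same value
    // return the very same object (reference identity, not equals()).
    static boolean isShared(long value) {
        return BigInteger.valueOf(value) == BigInteger.valueOf(value);
    }

    public static void main(String[] args) {
        // Values inside and just outside the assumed [-16, 16] cache range,
        // plus one value typical of a line or column number in a large file.
        long[] samples = {0, 1, 16, -16, 17, -17, 1000};
        for (long v : samples) {
            System.out.println(v + " shared: " + isShared(v));
        }
    }
}
```

On a JDK with that cache, only the first four samples should report as shared. Line, column and length values stored in LocAnn for any non-trivial file would fall outside the range, so each would be a separate object unless the code creating them reuses instances itself.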