From: Leo F. <leo...@ne...> - 2013-04-12 15:28:56
Hi Andrius,

I don't think so. The big integers are used for the line / column / buffer length numbers, no? So if you have large files they will get large. Or am I mistaken? Where did the -16..16 come from?

Leo

On 12 Apr 2013, at 09:12, Andrius Velykis <and...@ne...> wrote:

Hi Leo,

(reviving an old post..)

As is, the only major problem with memory consumption is the duplication (e.g., ArrayList/Object[]). The presence of various (4-6) BigIntegers in LocAnn is also an issue, but given most of them are shared it's not such a big deal.

From what I gather, BigIntegers are only shared when in the range of [-16, 16] -- looking at BigInteger.valueOf()? Or am I mistaken? If so, most of the BigIntegers would not be shared in CZT..

Andrius

On Wed, Jan 25, 2012 at 6:33 AM, Leo Freitas <leo...@ne...> wrote:

Hi Tim,

Yes, after parsing. And yes, I found that rather odd too. There is something funny going on in Garbage Collection somewhere. I guess the trouble is that there are certain structures (e.g., TokenSequence iterators) that end up retaining the biggest part of the object memory.

Having said that, I am now trying to find the sources of the problem(s) using more intrusive snapshot points (AKA memory dumps at particular execution points).

As is, the only major problem with memory consumption is the duplication (e.g., ArrayList/Object[]). The presence of various (4-6) BigIntegers in LocAnn is also an issue, but given most of them are shared it's not such a big deal.

I've tweaked the PerformanceSettings constant(s) a little bit more to keep ArrayList/Object[] capacity at 0 at creation, and that just about halved the memory used (!)... possibly at some speed expense, given it will take some time to increase capacity.
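
On the BigInteger sharing question raised above, here is a minimal sketch. It assumes an OpenJDK-style BigInteger.valueOf() that only reuses instances for small values; the [-16, 16] range Andrius mentions is an implementation detail as far as I can tell, not something the Javadoc promises. BigIntegerPool is a hypothetical helper, not a CZT or JDK class; it only illustrates how line/column numbers could be shared explicitly if the duplicates ever mattered.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

class BigIntegerSharing {

    /** Hypothetical interning pool: one BigInteger instance per distinct long value. */
    static final class BigIntegerPool {
        private final Map<Long, BigInteger> pool = new HashMap<>();

        /** Returns the pooled instance for the value, creating it on first use. */
        BigInteger of(long value) {
            return pool.computeIfAbsent(value, BigInteger::valueOf);
        }
    }

    public static void main(String[] args) {
        // Identity (==) is used on purpose: the question is whether the
        // same object is handed out twice, not whether the values are equal.
        // The results of the first two lines depend on the JDK implementation.
        System.out.println(BigInteger.valueOf(16) == BigInteger.valueOf(16));
        System.out.println(BigInteger.valueOf(1000) == BigInteger.valueOf(1000));

        // With an explicit pool, sharing no longer depends on the JDK's cache.
        BigIntegerPool pool = new BigIntegerPool();
        System.out.println(pool.of(1000) == pool.of(1000));
    }
}
```

Real code should still compare BigIntegers with equals(). An unbounded HashMap also never shrinks, so a pool like this would only pay off if the same line/column values recur often enough to outweigh the map's own overhead.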
In the profiling sessions, that extra growth cost wasn't a problem: I've put the initial capacity at 0, 1, and 10 (e.g., only very few go beyond that, mainly on type info), and the CPU difference was marginal (e.g., with 10 it was 300ms - 2.75% - quicker; with 1, 100ms quicker), yet the memory saving was significant (e.g., with 10 and 1 the heap was twice as large, at 100MB instead of 50MB for 0).

One nice thing would be to have this as a configuration file or a SectionManager option.... once all key bottleneck points are identified. By default I will keep it at 0, as the performance penalty in CPU time seems marginal/acceptable (e.g., 2-3% penalty for halving memory needs).

Suggestions / comments?

Best,
Leo

On 24 Jan 2012, at 23:11, Tim Miller wrote:

> Nice work!
>
> When you say you added "parser = null", I assume you mean after you do
> the parsing? This would mean that, prior to your change, a reference to
> the parser is being kept even after the variable scope of the variable
> "parser" ends? That seems odd.
>
> On 25/01/12 01:22, Leo Freitas wrote:
>> Tim,
>>
>> I forgot to mention below that I was talking about parsing and typechecking only.
>>
>> I've just run the same profiling setup on the VCG for these larger examples and yes, the worst ones were
>> Object[] (24%), char[] (17.3%), ArrayList[] (9%) and BigInteger (2-3%).
>>
>> Surprising ones were int[] (6.5%) and short[] (6%) arrays, since they are not explicitly created anywhere in CZT.
>> LocAnn took 3%; Iterators were also a surprise: both iterator() and listIterator() cover about 10%, where more
>> obvious ones like String (0.9%) and ZName (1.7%) were quite low.
>>
>> I've run the profiling sessions about 3-5 times on each example, taking the average.
>>
>> ----
>>
>> Searching for potential sources of such unexpected types, I managed to find a few successful candidates:
>>
>> a) ParseUtils
>>
>> char[], short[], and int[] are mostly present in the Java CUP low-level classes to represent the internal parsing tables from the grammar.
>> I simply added an explicit "parser = null" to the main ParseUtils.parse method. These arrays combined accounted for about 30% of memory.
>>
>> After the change, these arrays now account for 1.1% (!!!); that's quite an improvement.
>>
>> b) ListIterator and Iterator
>>
>> These appear in various places, but the places that should influence all runs are only in SmartScanner (and possibly PrettyPrinter).
>> After this change, they went from a combined (ListIterator + Iterator) 49% footprint to 17%; another good improvement.
>>
>> d) ArrayList and Object[]
>>
>> There are various places, and identifying which ones are the most significant is tedious/time-consuming.
>> Instead, I've done a thorough search through various projects and changed default constructors to more sensible values, when possible,
>> or to a parameterised default when not (e.g., PerformanceSettings interface in util project).
>>
>> This led to a decrease in the memory footprint of these objects of....
>>
>> e) BigInteger
>>
>> Why do we need big integers within LocAnn and other places? I guess because long would potentially be too small?
>> But would there really be something bigger than 2^32 or 2^64 in number of lines of source in a file, say? Hum...
>>
>> In terms of memory footprint, it varied from 0.7% to 3%, depending on the example. I won't change this one for now.
>>
>> f) Garbage collection time
>>
>> At first it was about 16-20%; now it is about 46.1% of the time taken. I don't know if that's related to the changes, but it looks like it.
>>
>> ======
>>
>> Changes a) and b) led to a drastic memory footprint decrease from 8MB to 0.91MB;
>> other changes led to a minor improvement in comparison (i.e., 0.80MB).
>>
>> That's despite the fact that changes in d) were the most numerous - they didn't amount to much improvement, but some.
>>
>> Hopefully this will enable much better performance on larger specs. I am committing it now to see how it goes.
>>
>> Best,
>> Leo
>>
>> On 23 Jan 2012, at 14:00, Leo Freitas wrote:
>>
>>> Hi Tim,
>>>
>>> That's interesting. I've been doing similar (profiling) tests over specs of some size (e.g., Mondex, Tokeneer, Xenon, IEEE floating point unit);
>>> although I guess they are smaller than iFACTS - apart from Xenon, which is quite large.
>>>
>>> In the profiling sessions, the worst culprit was "char[]" arrays, mostly from the java_cup lexer. The Object[] and ArrayList[] were about 2-3% each,
>>> whereas the char[] was 90% (!)... Similarly, on profiling CPU, it was the IO operations on zzRefill within the java_cup scanner that took the largest chunk
>>> of the time (27%). Smart scanning (e.g., lookahead) has taken only about 2-3%.
>>>
>>> I wonder... what was the profiling setup that you used to get to the creation of Object[] / ArrayList as the main problem?
>>> Although the change you refer to below shouldn't be relatively simple to change, as Petra pointed out.
>>>
>>> I just want to get the right picture to tackle such performance problems for larger specs.
>>>
>>> Best,
>>> Leo
>>>
>>> On 5 Jan 2012, at 23:51, Tim Miller wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> Anthony Hall and I have been discussing some memory problems that CZT
>>>> has when parsing large specifications. Anthony has been trying to
>>>> typecheck the iFacts specification, but without much luck due to the
>>>> large memory resources it needs.
>>>>
>>>> We've each been playing around with VisualVM, and Anthony pointed out
>>>> that the two largest memory hogs are Object[] and ArrayList, taking up
>>>> around 23% and 10% of the heap respectively.
>>>> Most of the object arrays and ArrayLists contain exactly 10 items, and
>>>> almost all items in these lists are null.
>>>>
>>>> After some poking around, I discovered that when ArrayList is created
>>>> using the default constructor, it allocates 10 items initially. This
>>>> appears to be where the 10 items come from in each case. I suspect this
>>>> may also be contributing to some of the memory problems, considering
>>>> that CZT has so many empty annotation lists, etc. Creating ArrayLists
>>>> with an initial capacity of 0 or 1 (using the constructor ArrayList(int
>>>> initialCapacity)) may give us some substantial space savings.
>>>>
>>>> It appears that most ArrayLists are created in the gnast-generated code.
>>>> Petra, how difficult would it be to create these lists using this other
>>>> constructor?
>>>>
>>>> Regards,
>>>> Tim
>>>>
>>>> _______________________________________________
>>>> CZT-Devel mailing list
>>>> CZT...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/czt-devel
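
The ArrayList(int initialCapacity) suggestion in Tim's original message can be sketched as follows. This is only an illustration: INITIAL_LIST_CAPACITY and newSmallList() are hypothetical names standing in for a PerformanceSettings-style constant, not the actual CZT or gnast code. Note also that, as far as I know, newer JDK releases delay the default allocation until the first add, so the saving depends on the JDK in use.

```java
import java.util.ArrayList;
import java.util.List;

class AnnotationListDemo {

    /** Hypothetical stand-in for a PerformanceSettings-style constant. */
    static final int INITIAL_LIST_CAPACITY = 0;

    /**
     * Creates an empty list without reserving room for 10 elements up front.
     * The list still grows on demand when elements are added.
     */
    static <T> List<T> newSmallList() {
        return new ArrayList<>(INITIAL_LIST_CAPACITY);
    }

    public static void main(String[] args) {
        // Typical case from the thread: an annotation list that stays empty
        // or holds only one or two entries.
        List<String> annotations = newSmallList();
        System.out.println(annotations.size());

        // Adding still works; the backing array is allocated and grown as needed.
        annotations.add("LocAnn");
        System.out.println(annotations);
    }
}
```

The trade-off is the one measured earlier in the thread: lists that do grow pay for a few extra array copies, while the many lists that stay empty or tiny no longer carry mostly-null backing arrays.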