From: Anthony H. <an...@an...> - 2013-04-13 08:47:11
Dear Andrius

This is brilliant!

Since I had a profiler on CZT running for catching some leftover debugging code, I thought I would give a quick look into the CZT memory usage as well. I have discovered that most of the large number of ZNames that are created are referenced from typechecker Signatures (via NameTypePairs). This is the case because, when creating type signatures during typechecking, the Signatures are duplicated in a lot of places.

A prominent example of this is in net.sourceforge.czt.typechecker.z.ExprChecker:94 (method visitRefExpr). This bit of code calculates the type of a RefExpr. If the reference is to a Schema, it duplicates the Schema signature (assigning new IDs to all its names) and then uses the new signature within the power type.

As a quick test, I tried removing this duplication and instead reusing the original schema signature (with the original ZName ids): `Signature sig = signature;`

This single change reduced the memory consumption of the typechecked `spec.tex` by about 60%! Excellent!

This led me to thinking: do signatures really need to be duplicated everywhere? I imagine that the signature of schema A would be the same everywhere? Can we reuse the signature, or is it important to duplicate and assign new IDs to the names?

The massive space saving becomes quite obvious when you think about it. If we have schema references, every RefExpr would duplicate the whole schema definition when typechecked, hence creating a lot of new ZName instances and consuming all this memory.

Leo advised that Tim would be the person to ask about the Signature duplication in the typechecker. Is it really necessary, or can we reuse the objects?

Andrius, this is exactly the sort of thing I hoped might be possible.
I don’t know how much of the previous discussion you’re aware of, but at first sight it does look as if objects are being copied many, many times, and the total number of objects created seems to be bigger than should be needed, not by a few percent but by orders of magnitude. If Tim can confirm that these objects and others like them are indeed immutable, then it seems not only more efficient but also the Right Thing simply to copy the object reference. If there are a few more places where you can find savings of this sort of percentage, then the problem would be fixed. (Two more 60% reductions and you’d have an order of magnitude already!)

I have a selfish follow-up question. Assuming Tim can confirm that this, or something like it, is acceptable, would you be able to do a bit of work on this? It’s true that I had very diffidently volunteered to look at this area, but it would clearly be much more effective and efficient for you, who obviously know what you are doing, to look at it than for me to spend a lot of time groping around and wasting your time asking dumb questions.

Of course I’d be extremely happy to help with any testing, especially if you get to the point where typechecking large specifications becomes feasible (at the moment it simply isn’t possible to typecheck the whole of the iFACTS spec in CZT on any machine I’ve been able to try).

Anyway, I’m really encouraged by this result, and thank you so much for doing this. I’m delighted (and impressed!)

Anthony
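
P.S. To make the copy-versus-share point concrete, here is a minimal Java sketch. Everything in it is an assumption for illustration: `Name`, `Sig`, `duplicate()` and `namesAllocated()` are hypothetical stand-ins, not the real CZT `ZName`/`Signature` API, and the three-name schema is made up. It only counts how many name objects get allocated when each schema reference takes its own deep copy of the signature, compared with sharing one reference.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for illustration only; NOT the real CZT classes.
public class SignatureSharingSketch {

    /** Running count of Name objects allocated so far. */
    static int namesCreated = 0;

    /** Immutable stand-in for a ZName: a word plus a unique id. */
    static final class Name {
        final String word;
        final int id;

        Name(String word) {
            this.word = word;
            this.id = ++namesCreated; // every allocation gets a fresh id
        }
    }

    /** Immutable stand-in for a Signature: a fixed list of names. */
    static final class Sig {
        final List<Name> names;

        Sig(List<Name> names) {
            this.names = Collections.unmodifiableList(new ArrayList<>(names));
        }

        /** Deep copy: allocates a new Name (with a new id) for every entry. */
        Sig duplicate() {
            List<Name> copy = new ArrayList<>();
            for (Name n : names) {
                copy.add(new Name(n.word));
            }
            return new Sig(copy);
        }
    }

    /**
     * Simulates resolving `refs` references to one three-name schema and
     * returns how many Name objects were allocated in total.
     */
    static int namesAllocated(int refs, boolean duplicatePerReference) {
        int before = namesCreated;
        Sig schema = new Sig(Arrays.asList(new Name("x"), new Name("y"), new Name("z")));
        List<Sig> used = new ArrayList<>();
        for (int i = 0; i < refs; i++) {
            // Either copy the whole signature, or just copy the reference.
            used.add(duplicatePerReference ? schema.duplicate() : schema);
        }
        return namesCreated - before;
    }

    public static void main(String[] args) {
        System.out.println("duplicating: " + namesAllocated(1000, true));
        System.out.println("sharing:     " + namesAllocated(1000, false));
    }
}
```

In this toy setup, 1000 references allocate 3003 names when duplicating and 3 when sharing. Sharing is only safe if the shared objects are never mutated and nothing relies on each reference having its own name ids, which is exactly the question for Tim.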