From: Duncan C. <dun...@wo...> - 2005-01-29 17:48:12
On Sat, 2005-01-29 at 16:57 +0000, Axel Simon wrote:
> I'm just wondering: Could this be a ulimit problem rather than the
> amount of physical memory? John, could you check that as well?

It is possible. My guess was based on the likelihood of the OOM killer deciding to kill the process, since the kernel had over-committed memory and c2hs was the culprit of the high VM pressure.

> You should certainly stick to 0.9.7 even though c2hs' memory usage is
> really bad.

We are still trying to understand what is happening here. Part of the problem is the binary serialisation patch, but it's more than that. The binary serialisation generates vast amounts of garbage but doesn't really retain much memory. That is to say, if you set -Hxxxm (the GHC RTS heap-size hint) quite low, but just high enough for the parsing & name analysis to complete, then it is also enough for the binary serialisation. That said, with a low heap limit the serialisation takes much longer because it has to garbage collect so frequently.

In a bit of testing myself, I found that the minimum heap limit that would still work is -H350m; -H340m died due to heap exhaustion.

The real memory culprit is the c2hs model. It accumulates everything into these big maps. There are really three phases:

1. parsing - this just builds the maps without looking anything up
2. name analysis - this does lots of lookups in the AST and builds 4 other maps
3. code generation - does just a few lookups in the 4 maps

The precomp patch does the serialisation after step 2. The memory use for step 3 is low since only a few things get looked up (and lazily deserialised).

The first step is not inherently memory hungry. It could operate in (more-or-less) constant space by serialising AST declarations as it goes (see the first sketch below).

The second step is the worst from a memory-consumption point of view. It does lots of lookups and sticks modified references to AST elements into these other maps. The module is c2hs/c/CNames.hs, which exports just nameAnalysis. I have not yet looked at it in enough detail to really understand it. It shouldn't be too hard, however; there's only about 100 lines of code.

I think what we want to figure out is whether this name analysis algorithm really requires everything to be kept in memory, or whether it could be changed to work in a mode where it deserialises a bit of AST from one file and writes out a bit of a map into another file, using very little memory in between. Even if it has to keep a record of which names it has seen before (to detect name clashes), that would be an improvement, since the names are pretty small compared to the values for which they are the keys (see the second sketch below).

Duncan
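
P.S. For concreteness, a rough sketch of what phase 1's constant-space mode could look like, assuming a Binary-style encode in the vein of Data.Binary (the interface in the actual serialisation patch may well differ):

  import qualified Data.ByteString.Lazy as BL
  import Data.Binary (Binary, encode)
  import System.IO (Handle)

  -- Serialise each declaration as the parser produces it, instead of
  -- accumulating them all into one big map first. Memory stays roughly
  -- constant as long as the input list is generated lazily.
  streamDecls :: Binary d => Handle -> [d] -> IO ()
  streamDecls out = mapM_ (BL.hPut out . encode)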
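
And step 2 could in principle work the same way, keeping only the set of names seen so far in memory for clash detection. A minimal sketch, with a stand-in Decl type rather than the real c2hs AST, and just the name standing in for the real (name, value) map entry:

  import qualified Data.ByteString.Lazy as BL
  import qualified Data.Set as Set
  import Data.Binary (encode)
  import System.IO (Handle)

  data Decl = Decl { declName :: String }  -- stand-in for the real AST node

  -- Process declarations one at a time: clashes are detected against
  -- the (small) set of names already seen, and each entry is written
  -- straight to disk rather than retained in a big in-memory map.
  streamNameAnalysis :: Handle -> [Decl] -> IO ()
  streamNameAnalysis out = go Set.empty
    where
      go _    []     = return ()
      go seen (d:ds)
        | declName d `Set.member` seen =
            putStrLn ("name clash: " ++ declName d) >> go seen ds
        | otherwise = do
            BL.hPut out (encode (declName d))
            go (Set.insert (declName d) seen) ds

The point being that only the name set grows; the values never stay in memory.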