On Tue, Oct 6, 2009 at 11:41 AM, Martin Cracauer <cracauer@...> wrote:
> Scott L. Burson wrote on Tue, Oct 06, 2009 at 11:30:00AM -0700:
>> [Whoops -- sent this to Martin without CC'ing the list.]
>> On Tue, Oct 6, 2009 at 10:34 AM, Martin Cracauer <cracauer@...> wrote:
>> > Scott L. Burson wrote on Tue, Oct 06, 2009 at 09:41:56AM -0700:
>> >> You might be interested in this:
>> >> http://bugzilla.kernel.org/show_bug.cgi?id=5493
>> > We don't see such effects, but we don't have Lisp heaps close to
>> > approaching physical memory.
>> > The guys in the bug report run a 10-50 GB heap on a 4 GB box. That's
>> > gotta be unhealthy. What do they do on a full GC?
>> The "guy in the bug report" is me.
> Sorry :-)
No problem :-)
>> I eventually brought that box up
>> to 24GB, though I guess I hadn't done so yet at the time I first filed
>> the report. (Works okay, with three 15kRPM paging drives. Nowadays
>> one would use SSDs, if budget permits.)
> But then I don't follow why you think the linear scan on the mprotect
> system call is the problem.
Uh, maybe I've been unclear. Certainly I would have had a serious
thrashing problem if I had actually tried to run a 35GB heap in 4GB of
main memory. But I didn't. What I saw was that even at 3 - 3.5GB (in
4GB of memory) the machine was already brought to its knees (one core
was spending 100% of its time in the kernel, and the machine became
progressively unresponsive). Adding RAM didn't change this.
> Or are you saying that the linear scan happens when Linux is looking
> for free pages or tries to free pages?
Something like that, yes.
>> Oh, what I didn't mention (though it's in the bug report, near the
>> bottom) is that I ultimately worked around the problem by switching to
>> Solaris, which has no trace of this problem -- a 35GB heap works just
> So when is this linear scan happening? On the system call or on paging
I don't think it's just on the system call -- IIRC, even when the
process was quiescent, the machine was still bogged down in the kernel.
>> > I doubt that their conclusion that this has to do with the ratio of
>> > RAM to heap is correct, though. It's probably just the number of
>> > mappings that counts here.
>> I believe that's correct.
> But then why did it improve by putting in more RAM?
Oh, I see the confusion. Adding RAM was necessary to avoid thrashing
as the heap grew beyond 4GB (which it could do once I switched to
Solaris), but it didn't fix _this_ problem. I just mentioned that I
had brought the box up to 24GB as a reply to your aside that running a
50GB heap in a 4GB box is not likely to work well.
>> >> Basically, there's code in the Linux kernel that does a linear scan
>> >> through the vma table which holds all these mappings.
>> > Linear scan when? On the mprotect call itself?
>> I think it's somewhere else -- some path the kernel goes through
>> regularly for some reason. Not sure, though.
> I think it is the actual paging activity then. The kernel has memory
> shortage and no reusable pages at the ready.
No, I don't think it is the actual paging activity. No actual paging
activity was required in order for the problem to be observed.
Figuring out what is driving it would require studying the VMA code,
which I started to try to do once, but it's, uh, nontrivial.
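
For anyone who wants to see how page protection multiplies entries in the
mapping table, here is a minimal, Linux-only sketch. It is not taken from SBCL
or from the bug report. The helper names (count_mappings,
protect_alternate_pages) are made up for illustration, and it assumes
/proc/self/maps is readable and that the C library exposes mprotect. It makes
every other page of one anonymous region read-only and compares the number of
lines in /proc/self/maps before and after.

```python
import ctypes
import mmap


def count_mappings():
    """Count this process's mappings, one per line of /proc/self/maps."""
    with open("/proc/self/maps") as maps:
        return sum(1 for _ in maps)


def protect_alternate_pages(n_pages):
    """Make every other page of a fresh region read-only.

    Returns (mappings_before, mappings_after). Hypothetical helper,
    for illustration only.
    """
    page = mmap.PAGESIZE
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]

    # One private anonymous read/write region: a single mapping to start with.
    region = mmap.mmap(-1, n_pages * page, flags=mmap.MAP_PRIVATE)
    view = (ctypes.c_char * (n_pages * page)).from_buffer(region)
    base = ctypes.addressof(view)

    before = count_mappings()
    # Neighbouring pages now differ in protection, so the kernel cannot
    # describe them with one mapping and has to split the region.
    for i in range(0, n_pages, 2):
        if libc.mprotect(base + i * page, page, mmap.PROT_READ) != 0:
            raise OSError(ctypes.get_errno(), "mprotect failed")
    after = count_mappings()

    del view          # release the buffer export so the region can be closed
    region.close()
    return before, after


if __name__ == "__main__":
    before, after = protect_alternate_pages(64)
    print(f"mappings before: {before}, after: {after}")
```

On a typical Linux box the count should go up by roughly the number of pages
in the region. A GC that write-protects pages scattered through a large heap
has the same effect at a much larger scale.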
> But that would mean that ITA isn't affected since we can't run with
> any paging activity going on anyway.
I think this is not correct, but it may be that given the GC page size
you're using, and your current heap size, you don't see the problem
Since you're at ITA, you might be amused by this historical tidbit.
Solaris actually used to have a very similar problem. Dan Weinreb's
OODB, ObjectStore, tickled it, as it also made very heavy use of page
protection. It was actually an Object Design engineer (possibly Sam
Haradhvala, who had previously been at Symbolics) who rewrote the
relevant parts of the Solaris kernel (with Sun's cooperation,
obviously) so that it would not bog down as the mapping table got
> Still sounds overall fine to me, but it is a further argument for
> mainline SBCL to raise the GC page size to 32 KB. I don't see a
> downside, it's fastest and whatever problem people have with the
> number of mappings is reduced to a fraction.
This makes sense to me too.
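
To put rough numbers on "reduced to a fraction", here is a back-of-the-envelope
sketch. It assumes the worst case, in which every GC page ends up with a
protection different from its neighbours, so that no two adjacent pages can
share a mapping. The 4 KB baseline is my assumption (the x86 hardware page
size); the 32 KB figure and the 35GB heap are from the thread. The function
name is made up for illustration.

```python
GIB = 1024 ** 3
KIB = 1024


def worst_case_mappings(heap_bytes, gc_page_bytes):
    """Upper bound on mappings if no two adjacent GC pages share a protection."""
    return heap_bytes // gc_page_bytes


if __name__ == "__main__":
    heap = 35 * GIB
    for gc_page_kib in (4, 32):
        n = worst_case_mappings(heap, gc_page_kib * KIB)
        print(f"{gc_page_kib:>2} KB GC pages: up to {n:,} mappings")
```

Under those assumptions the bound drops from 9,175,040 mappings to 1,146,880,
a factor of eight. Real heaps will sit well below the bound, since runs of
pages with the same protection are merged into one mapping.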