In article <BB005B48-CD0E-4B4E-88E9-6EA01125DD24@...>,
Benjamin Lambert <benlambert@...> wrote:
> I'm running into the old "mprotect call failed with ENOMEM" error, which
> "probably means that the maximum amount of separate memory mappings was
> exceeded."
> I've run into a situation where I can't easily change
> "/proc/sys/vm/max_map_count" , and I think I've maxed out
> *backend-page-bytes* at 256k or 1MB or so.
> However, rather than a work-around, I'd like to figure out why my code
> requires so many memory mappings (or why it's stressing SBCL/Linux in the way
> it is). The code is dealing with some gnarly data: lots of strings, arrays,
> and lists. I'm quite sure that my code is not handling this data
> optimally/elegantly. But I'm not sure how to begin debugging this.
mmap is used to grab memory from the OS. That usage is fairly normal, so
I doubt that's ever an issue.
SBCL's usage of mprotect, on the other hand, is very idiosyncratic.
A generational garbage collector is based on the assumption that old
data (that has already been garbage collected at least once) doesn't
change as much as younger data. In order to exploit that assumption,
the collector needs to be able to tell when and which older data have
been written to, and might then point to young data.
Language implementations these days seem to mostly instrument code with
software write barriers. SBCL, CMUCL and Boehm (under certain settings)
instead depend on the hardware MMU to detect writes: pages are write
protected, and writes are logged in the appropriate signal handler
before unprotecting the written page. Unless your code really breaks the
generational assumption, that's probably not too bad, since Linux will
merge mappings. On top of that, SBCL treats unboxed pages (that don't
hold pointers) specially wrt mprotect as they don't need any write
barrier, and tends to allocate them between regular pages, which
keeps the neighbouring protected mappings from being merged.
If your strings and arrays end up in unboxed pages, that could cause the
problem you're observing.
Ideally, SBCL would be fixed; in the meantime, I can see two avenues.
Unboxed objects like strings and arrays could be explicitly allocated in
the C heap, especially if they're long-lived; a couple people have code
lying around to pretend that these are regular Lisp objects. Otherwise,
it might be possible to slightly modify the generational GC to remove
the write barrier and assume everything has always been written.
Hopefully, someone else has better ideas.