Milan Zamazal <pdm@...> writes:
> I always get crashes or freezes with the following trivial CLX program
> and all recent versions of SBCL, including 0.8.18, on my Debian x86
> system, Linux 2.6, CLX 0.6. It happens even when both CLX and
> clock.lisp are compiled with (optimize (safety 3)).
>
> It looks like a SBCL bug to me.
>
> In case of crash, the crash is reported as follows:
>
> Argh! gc_find_free_space failed (first_page), nbytes=16.
> Gen Boxed Unboxed LB LUB !move Alloc Waste Trig WP GCs Mem-age
> 0: 129350 0 0 0 0 529397048 420552 2000000 0 0 0.0000
> 1: 1267 399 42 14 264 6942168 111144 8757120 1039 1 3.9396
> 2: 0 0 0 0 0 0 0 2000000 0 0 0.0000
> 3: 0 0 0 0 0 0 0 2000000 0 0 0.0000
> 4: 0 0 0 0 0 0 0 2000000 0 0 0.0000
> 5: 0 0 0 0 0 0 0 2000000 0 0 0.0000
> 6: 0 0 0 0 0 0 0 0 0 0 0.0000
> Total bytes allocated=536339216
> fatal error encountered in SBCL pid 15650
> The system is too badly corrupted or confused to continue at the Lisp
> level. If the system had been compiled with the SB-LDB feature, we'd drop
> into the LDB low-level debugger now. But there's no LDB in this build, so
> we can't really do anything but just exit, sorry.
This, to me, suggests that something is inhibiting garbage
collection: you have roughly 500MB allocated in the youngest
generation (generation 0). Tracking that down would be worthwhile.
To eliminate conservatism as the problem, try running your program
on a non-x86 platform, where the GC is a precise implementation of
Cheney's algorithm, uncontaminated by ambiguous roots. If that
likewise crashes with heap exhaustion, then look through your code
and its callees for things which would inhibit garbage collection --
without-gcing is the obvious operator, but there may be others.
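As a quick sanity check, something like the following might help show
whether an explicit collection reclaims anything at all. It is only a
sketch: REPORT-GC-EFFECT is a name I made up, and I am assuming that
SB-KERNEL:DYNAMIC-USAGE and the :FULL argument to SB-EXT:GC behave in
your build as they do in the ones I know.

```lisp
;; Hypothetical helper, not part of SBCL or CLX: compare dynamic-space
;; usage before and after an explicit full collection.
(defun report-gc-effect ()
  "Force a full GC; print and return heap usage (bytes) before and after."
  (let ((before (sb-kernel:dynamic-usage)))
    ;; Assumption: this call is not made inside WITHOUT-GCING, where
    ;; the collection would be deferred.
    (sb-ext:gc :full t)
    (let ((after (sb-kernel:dynamic-usage)))
      (format t "~&before GC: ~D bytes, after GC: ~D bytes~%" before after)
      (values before after))))
```

Calling (report-gc-effect) periodically from the clock loop, and seeing
the "after" figure climb steadily, would point towards garbage being
retained or collection being inhibited rather than towards a one-off
allocation spike.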
> In case of freeze, gdb indicates the freeze happens at the following
> place:
>
> 0x0805864e in futex_wait (lock_word=0x986c514, oldval=-1213677184)
> at linux-os.c:64
> 64 _syscall4(int,sys_futex,
> (gdb) where
> #0 0x0805864e in futex_wait (lock_word=0x986c514, oldval=-1213677184)
> at linux-os.c:64
This could so easily be a bug in the Linux kernel; I don't know enough
about threads to help you here.
Cheers,
Christophe