From: Bruce S. <bw...@ar...> - 2007-11-06 18:38:09
Sounds like something may have a memory leak. When the memory starts
getting low (and before things start aborting), I'd check to see if a
certain process is using a ton of memory.
Maybe run 'top' and sort it by memory usage ("M" key). It wouldn't hurt
to run that periodically to see if a process is slowly using more and
more memory.
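
If watching 'top' by hand gets tedious, something like the rough sketch
below could be run from cron to log the biggest memory users. It assumes
a Python 3 interpreter is available on the box and that /proc is mounted;
the function names (parse_status, top_memory) are made up for this
example and are not part of any existing tool.

```python
#!/usr/bin/env python3
"""Print the processes with the largest resident memory, read from /proc."""
import os


def parse_status(text):
    """Return (name, rss_kb) from the text of a /proc/<pid>/status file."""
    name, rss_kb = "?", 0
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            # Line looks like "VmRSS:   1234 kB"; kernel threads have none.
            rss_kb = int(line.split()[1])
    return name, rss_kb


def top_memory(limit=10, proc_dir="/proc"):
    """Return up to `limit` (rss_kb, pid, name) tuples, largest first."""
    if not os.path.isdir(proc_dir):
        return []
    rows = []
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_dir, entry, "status")) as fh:
                name, rss_kb = parse_status(fh.read())
        except OSError:
            continue  # process exited while we were scanning
        rows.append((rss_kb, int(entry), name))
    rows.sort(reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    for rss_kb, pid, name in top_memory():
        print(f"{rss_kb:>10} kB  {pid:>6}  {name}")
```

Appending the output to a file every few minutes and comparing the
entries over time should show whether one process keeps growing.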
Once we figure out where the memory is going, we can try to find a
solution to the problem.
- BS
> Hi Guys,
>
>
> We have around 150 DL boxes out in the wild now, and there may be a
> small problem that keeps happening. Under heavy load the boxes will
> start running low on RAM, at which point we get malloc errors for some
> daemons running on the unit. First thing to go is usually snmpd, which
> isn't a major issue. But further down the line we start losing things
> like ssh, and in some cases even more critical processes.
>
> Ok I know this is pretty much the problem with running things from RAM
> disks, but is there a way to ensure that network/tcp connections cannot
> steal all the RAM etc and effectively reserve RAM for the OS?
>
> BTW we aren't being tight with the RAM, most of these boxes have 512MB
> or 1GB, which should be enough for routing traffic and some
> traffic aggregation.
>
> I welcome any ideas on this front.
>
> Cheers
>
> Mat