|
From: Matthew H. <mat...@va...> - 2007-11-06 17:15:18
|
Hi Guys,

We have around 150 DL boxes out in the wild now, and there may be a
small problem that keeps happening. Under heavy load the boxes start
running low on RAM, at which point we get malloc errors for some
daemons running on the unit. The first thing to go is usually snmpd,
which isn't a major issue. But further down the line we start losing
things like ssh, and in some cases even more critical processes.

OK, I know this is pretty much the problem with running things from RAM
disks, but is there a way to ensure that network/TCP connections cannot
steal all the RAM, effectively reserving some RAM for the OS?

BTW we aren't being tight with the RAM: most of these boxes have 512 MB
or 1 GB, which should be enough for routing traffic and some traffic
aggregation.

I welcome any ideas on this front.

Cheers

Mat
|
|
From: Bruce S. <bw...@ar...> - 2007-11-06 18:38:09
|
Sounds like something may have a memory leak. When the memory starts
getting low (and before things start aborting), I'd check to see if a
certain process is using a ton of memory.
Maybe run 'top' and sort it by memory usage ("M" key). It wouldn't hurt
to run that periodically to see if a process is slowly using more and
more memory.
Once we figure out where the memory is going, we can try to find a
solution to the problem.
- BS
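
The periodic check with 'top' could also be scripted. What follows is
only a minimal sketch, assuming a Linux /proc filesystem and a Python 3
interpreter on the box (neither is confirmed anywhere in this thread).
The function names and the 1024 kB growth threshold are made up for
illustration. It ranks processes by resident memory (VmRSS) and flags a
series of samples that only ever grows:

```python
import os


def parse_status(text):
    """Return (name, rss_kb) from the text of a /proc/<pid>/status file."""
    name, rss_kb = "?", 0
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            # Line looks like "VmRSS:    1234 kB"; kernel threads have no
            # VmRSS line at all, so rss_kb stays 0 for them.
            rss_kb = int(line.split()[1])
    return name, rss_kb


def top_by_rss(proc_root="/proc", limit=10):
    """Return up to `limit` (rss_kb, pid, name) tuples, largest RSS first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, rss_kb = parse_status(fh.read())
        except OSError:
            continue  # the process exited while we were scanning
        rows.append((rss_kb, int(entry), name))
    rows.sort(reverse=True)
    return rows[:limit]


def is_growing(samples_kb, min_growth_kb=1024):
    """True if RSS never shrank between samples and grew by at least
    min_growth_kb overall. This is a hint of a leak, not proof."""
    if len(samples_kb) < 2:
        return False
    never_shrinks = all(b >= a for a, b in zip(samples_kb, samples_kb[1:]))
    return never_shrinks and samples_kb[-1] - samples_kb[0] >= min_growth_kb
```

Calling top_by_rss() every few minutes and keeping the values for a
suspect daemon gives the list of samples that is_growing() expects. A
process whose RSS climbs steadily under constant load is the first
candidate to look at.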
|
|
From: Serge L. <fi...@in...> - 2007-11-06 19:28:26
|
Hi Matthew,

Matthew Hattersley wrote:
> We have around 150 DL boxes out in the wild now, and there may be a
> small problem that keeps happening. Under heavy load the boxes will
> start running low on RAM, at which point we get malloc errors for some
> daemons running on the unit. The first thing to go is usually snmpd,
> which isn't a major issue. But further down the line we start losing
> things like ssh, and in some cases even more critical processes.

That's weird! I've just finished troubleshooting terrible memory leaks
in our application (not DL related) and found that net-snmp 5.4.0 eats
memory at great speed (especially libsnmp). The release notes for 5.4.1
show that some leaks were fixed in that release ([BUG 1619827],
[PATCH 1616912], [PATCH 1592706]). I think we could update net-snmp to
5.4.1, but I haven't yet checked whether the update really solves our
problem (and I can't guarantee that it solves yours).

--
Sincerely,
Serge Leschinsky
|
|
From: Frank W. <Fra...@ct...> - 2007-11-07 07:28:33
|
On Tuesday 06 November 2007 20:28:07 Serge Leschinsky wrote:
> That's weird! I've just finished troubleshooting terrible memory leaks
> in our application (not DL related) and found that net-snmp 5.4.0 eats
> memory at great speed (especially libsnmp). The release notes for 5.4.1
> show that some leaks were fixed in that release ([BUG 1619827],
> [PATCH 1616912], [PATCH 1592706]). I think we could update net-snmp to
> 5.4.1, but I haven't yet checked whether the update really solves our
> problem (and I can't guarantee that it solves yours).

I can confirm the memory leak in net-snmp's snmpd. When I discovered it
I googled around a bit, and it seemed to be a well-known problem with
net-snmp.

snmpd eats up to 90% of memory on my DLs (256 MB RAM), and then stuff
starts to fail. I obviously notice it when ssh fails.

As a workaround, I just restart snmpd on all my DLs every once in a
while. MRTG handles this nicely, and I don't lose too much information.

It looks like the problem occurs mostly, or sooner, on boxes that also
run IPsec tunnels.

Matthew: wow, 150 DLs. How do you manage them?

Have a nice day,

Frank

--
_______________________________________________
Centre de Technologie de l'Education
29 avenue John F. Kennedy
L-1855 Luxembourg-Kirchberg
email: Fra...@ct...
tél.: +352 478-5973
fax: +352 333797
_______________________________________________
|
|
From: John J. <jo...@jo...> - 2007-11-07 10:32:48
|
Frank,

Thank you very much for this. My DL box was having problems last night
and a reboot solved it. It was low on memory, with INIT constantly
re-spawning. And I too use net-snmp's snmpd.

In your experience, was it enough to restart snmpd, and would adding
this to a cron job once a month potentially be enough to avoid
restarting the entire DL?

Regards,
John Jore

________________________________________
From: dev...@li... [devil-linux-discuss-b...@li...] On Behalf Of Frank Weis [Fra...@ct...]
Sent: 07 November 2007 07:28
To: dev...@li...
Subject: Re: [Devil-Linux-discuss] [BULK] Re: RAM Usage

I can confirm the memory leak in net-snmp's snmpd. When I discovered it
I googled around a bit, and it seemed to be a well-known problem with
net-snmp.

snmpd eats up to 90% of memory on my DLs (256 MB RAM), and then stuff
starts to fail. I obviously notice it when ssh fails.

As a workaround, I just restart snmpd on all my DLs every once in a
while. MRTG handles this nicely, and I don't lose too much information.

_______________________________________________
Devil-linux-discuss mailing list
Dev...@li...
https://lists.sourceforge.net/lists/listinfo/devil-linux-discuss
|
|
From: Frank W. <Fra...@ct...> - 2007-11-07 12:23:23
|
Hi John,

I restart snmpd more or less once a week. Once a month is definitely
not enough for some of mine (presumably those that run IPsec tunnels).

Regards,

Frank

On Wednesday 07 November 2007 11:27:41 John Jore wrote:
> Frank,
> Thank you very much for this. My DL box was having problems last night
> and a reboot solved it. It was low on memory, with INIT constantly
> re-spawning. And I too use net-snmp's snmpd.
>
> In your experience, was it enough to restart snmpd, and would adding
> this to a cron job once a month potentially be enough to avoid
> restarting the entire DL?
>
> Regards,
> John Jore
|
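
Instead of restarting on a fixed schedule, the restart could be made
conditional on how much memory the daemon actually holds. The following
is a rough sketch only, assuming Python 3 and a Linux /proc filesystem
on the box. The 25% threshold, the function names and the restart
command are illustrative assumptions, not something anyone in this
thread has tested:

```python
import subprocess


def mem_total_kb(meminfo_text):
    """Extract MemTotal (in kB) from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


def rss_kb(pid, proc_root="/proc"):
    """Resident set size of one process in kB (0 if not reported)."""
    with open(f"{proc_root}/{pid}/status") as fh:
        for line in fh:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    return 0


def should_restart(daemon_rss_kb, total_kb, max_fraction=0.25):
    """True when the daemon holds more than max_fraction of total RAM.
    The 0.25 default is an arbitrary illustrative value."""
    return daemon_rss_kb > total_kb * max_fraction


def check_and_restart(pid, command, max_fraction=0.25):
    """Restart the daemon only if it is over the limit.

    `command` is whatever restarts the daemon on your system, given as
    an argument list; it is an assumption supplied by the caller.
    Returns True if a restart was attempted and exited with status 0.
    """
    with open("/proc/meminfo") as fh:
        total = mem_total_kb(fh.read())
    if not should_restart(rss_kb(pid), total, max_fraction):
        return False
    return subprocess.call(command) == 0
```

A cron job could call check_and_restart() with the snmpd PID (read from
its pid file, wherever that lives on your build) and a command such as
["/etc/init.d/snmpd", "restart"]. That path is a guess and should be
checked on the box first. Run hourly, this would restart the boxes with
IPsec tunnels as often as they need it and leave the others alone.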
|
From: Udo L. <ul...@ab...> - 2007-11-27 13:46:44
|
Hi,

I also have a memory leak on one DL 1.2.13 box (acting as a mail
server). In my case saslauthd needs to be restarted (once, or better
twice, a week) and then everything works fine.

Best regards
Udo

Matthew Hattersley wrote:
> We have around 150 DL boxes out in the wild now, and there may be a
> small problem that keeps happening. Under heavy load the boxes will
> start running low on RAM, at which point we get malloc errors for some
> daemons running on the unit. The first thing to go is usually snmpd,
> which isn't a major issue. But further down the line we start losing
> things like ssh, and in some cases even more critical processes.
|