From: Jethro R B. <jet...@st...> - 2006-09-28 20:57:37
On Thu, 28 Sep 2006, Dominik Gehl wrote:

> what exact Perl version are you using ? We have noticed some thread related
> issues in the perl version shipped with RHEL4 ...

Summary of my perl5 (revision 5 version 8 subversion 6) configuration:
  Platform:
    osname=linux, osvers=2.6.9-34.elsmp, archname=i386-linux-thread-multi
    uname='linux hs20-bc1-4.build.redhat.com 2.6.9-34.elsmp #1 smp fri feb 24 16:56:28 est 2006 i686 i686 i386 gnulinux '
...

>
> Dominik
>
>
> Jethro R Binks wrote:
> > On Wed, 27 Sep 2006, Kevin Amorin wrote:
> >
> > > please open a bug report (bugs.packetfence.org) with syslog
> > > messages, verbosity=20 and we can work with you on it. I'm running on 10
> > > vlans, 1.6.1 with no problem so we will need to reproduce the issue.
> >
> > We will see what we can do here (first poster in this thread Mark Meiklejohn
> > is one of my colleagues working on this). Meanwhile, we have discovered a
> > couple of other issues/bugs while investigating this one:
> >
> > http://www.packetfence.org/mantis/view.php?id=129
> >
> > Working around them seemingly hasn't helped though.
> >
> > Also while investigating, we noticed that the configuration file is re-read
> > frequently, and also that the getlocalmac subroutine calls /sbin/ifconfig
> > each time it is called. This seems to be rather inefficient; surely the
> > local mac addresses are unlikely to change, so obtaining them at the startup
> > and just storing them in a data structure for pfmon to reference would be
> > better?
> >
> > > It feels like a deadlock issue, but we could be wrong. PFMON logs complain
> > > about cond_signal() and unlocked vars on lines 488, 717 and 720.
> >
> > Just to clarify this issue from Mark's post, we subsequently introduced
> > locking in (all?) these places, but it didn't make a substantial difference.
> >
> > We did have a theory at one point that the per-vlan threads are getting
> > deadlocked or dying somehow, and once one registration was done per VLAN it
> > wouldn't catch any others, however I think we've now seen evidence that that
> > isn't the case.
> >
> > I should also note that after reading back through the list archives, we
> > have increased values for kernel.shmmax and kernel.threads-max. The machine
> > is well-resourced, running FC4 SMP kernel, with 2 x Intel(R) Xeon(TM) CPU
> > 3.40GHz and 2 GB memory.
> >
> > Another oddity is in the open violation report; we seem to gather six or so
> > MACs which have status open for both "System scan" and "Registration
> > completed" - why do these not get closed?
> >
> > Top generally shows something like:
> >
> >   PID USER   PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
> >  7253 root   16   0  980m 339m 2212 S 53.1 16.8 194:38.14 pfmon
> >  5089 mysql  15   0  714m  74m 4576 S  3.3  3.7 113:29.83 mysqld
> >
> > I don't know if these are reasonable numbers or not for our size of
> > installation, around 1200 active nodes at the moment (however less than half
> > are registered because of our issues).
> >
> > We have also added the recent patch regarding problems re-registering when a
> > client is deleted from the database.
> >
> > I have a question though: when opening up node details (Lookup a MAC),
> > sometimes an IP address is displayed in parentheses, and sometimes not. I
> > assume the difference is that the IP address is only displayed when the node
> > is determined to be currently 'active' - is that correct? (And how is
> > 'active' defined? ARP or DHCP traffic seen within some interval?).
> >
> > Would it be a worthwhile future enhancement to keep MAC/IP associations
> > observed over time within the pf database?
> > To be fair, we already have that
> > data in other monitoring systems, but some may find a benefit to having pf
> > record that data, and it would save us looking in a different system for
> > the information.
> >
> > Thanks for any thoughts,
> >
> > Jethro.
> >
> >
> > > Kevin
> > >
> > >
> > > Bakhtiar A Hamid wrote:
> > > > dang.. too early in the morning. resent with my comments below.
> > > >
> > > > On 9/26/06, Bakhtiar A Hamid <bak...@gm...> wrote:
> > > >
> > > > > resent to list
> > > > >
> > > > > On 9/26/06, mark meiklejohn <mme...@ci...> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > We are running a Linux box that is serving 30 VLANs in passive mode,
> > > > > > and we have circa 2000 users and around half of them have
> > > > > > registered with the pf box.
> > > > > >
> > > > > > The problem appears to be that 'trapmac' fails after 30 mins of
> > > > > > operation. We saw this problem with rc3, version 1.6.0 and have
> > > > > > since upgraded to version 1.6.1 but the problem still persists.
> > > > > > However on some occasions it comes back to life for a while and then
> > > > > > dies off again.
> > > > > >
> > > > > > Currently we manually restart pf and this resolves the issue until
> > > > > > it fails to trap users again.
> > > > > >
> > > > > > It feels like a deadlock issue, but we could be wrong. PFMON logs
> > > > > > complain about cond_signal() and unlocked vars on lines 488, 717 and
> > > > > > 720.
> > > > > >
> > > > > > Any advice would be appreciated.
> > > > > >
> > > > > > Mark Meiklejohn
> > > > > >
> > > > > >
> > > > no advice, just confirmation that we saw this in our pf setup
> > > >
> > > > running centos linux with 6 vlans.
> > > >
> > > > after pf initially starts, all's well. within 24 hrs, trapmac no
> > > > longer is available. only listen_arps are.
> > > >
> > > >
> > > > restarting will make things ok again.
> > > >
> > > > i do see pfmon complaining of cond_signal()
> > > >
> > > > we've tried and patch(in this list) and still the same thing
> > > >
> > > > any help appreciated.
> > > >
> > > > tia
> > > >
> > > > > >
> > > > > > -------------------------------------------------------------------------
> > > > > > Take Surveys. Earn Cash. Influence the Future of IT
> > > > > > Join SourceForge.net's Techsay panel and you'll get the chance to
> > > > > > share your
> > > > > > opinions on IT & business topics through brief surveys -- and earn
> > > > > > cash
> > > > > > http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> > > > > > _______________________________________________
> > > > > > Packetfence-devel mailing list
> > > > > > Pac...@li...
> > > > > > https://lists.sourceforge.net/lists/listinfo/packetfence-devel
> > > > > >
> > > > >
> > > > > --
> > > > > http://myzope.kedai.com.my - my-zope org
> > > > >
> > >
> >
> > . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
> > Jethro R Binks
> > Computing Officer, IT Services
> > University Of Strathclyde, Glasgow, UK
>

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Jethro R Binks
Computing Officer, IT Services
University Of Strathclyde, Glasgow, UK