Re: [Keepalived-devel] KL v1.1.11 on RH ES v4 U2: Simple 2 node Apache Farm
From: Graeme F. <gr...@gr...> - 2006-02-22 10:55:58
On Wed 22 Feb 2006 10:26:49 GMT, Shaun McCullagh <sha...@xb...> wrote:

<snip>
> Initially things looked very good, but both PCs
> crashed every few weeks.
<snip>
> As there is no kernel panic I'm not sure how to go
> about investigating this, any suggestions would be most welcome.

It sounds to me like the machines are running into some sort of RAM famine *or* a packet reflection issue. Unfortunately these problems can be quite hard to diagnose, and there's definitely not a silver bullet approach, either.

Are the crash timings predictable, i.e. do they happen within a window of time on a specific day, after a number of days' runtime, at a specific time of the month?

Can you install the "sar" package and then post-process the data to see what your systems are doing at (and prior to) the time of the failure?

Also: as you have a master/backup system which is using DR, you're effectively running two "localnode" servers. Are your machines getting trapped by reflecting packets back and forth to one another?

This can happen when server A gets a request to the VIP and forwards it to server B. Server B then forwards it back to server A, rather than getting an application to process it, because the "backup" LVS on server B catches the packet before the application. When the packet reaches server A again, there's an entry already in the LVS table for that connection going to server B, so the packet goes back to server B where the same thing happens. Repeat to fade...

You can work around this by using the netfilter "MARK" target, and configuring keepalived to use fwmarks instead of a VIP. Have a look at this thread for more details:

http://marc.theaimsgroup.com/?t=113862542800006&r=1&w=2

You may find something useful there, the OP did. Not quite what was intended, but a solution nonetheless.

Graeme
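
To make the reflection loop described above a bit more concrete, here is a small toy model in Python. It is only a sketch, not how LVS or keepalived are implemented: the `Director` class, the `simulate` function and the "always pick the peer" scheduler are made up for illustration. It also assumes that the fwmark workaround marks only packets that did not arrive from the other director (for example by matching on the peer's MAC address), which is an assumption about the setup rather than something spelled out in this post.

```python
from dataclasses import dataclass, field


@dataclass
class Director:
    """Toy model of one node running both LVS and the application (hypothetical)."""
    name: str
    peer: str
    use_fwmark: bool
    # Toy LVS connection table: connection id -> node chosen for that connection.
    conn_table: dict = field(default_factory=dict)

    def lvs_catches(self, arrived_from):
        """Return True if this node's LVS intercepts the packet."""
        if not self.use_fwmark:
            # VIP-based virtual server: every packet addressed to the VIP matches.
            return True
        # fwmark-based virtual server. ASSUMPTION: the netfilter rule marks only
        # packets that were not sent by the peer director.
        return arrived_from != self.peer

    def handle(self, conn_id, arrived_from):
        """Return the node to forward to, or None for local delivery."""
        if not self.lvs_catches(arrived_from):
            return None
        # An existing table entry wins; otherwise this toy scheduler always
        # picks the peer, which is the worst case for reflection.
        target = self.conn_table.setdefault(conn_id, self.peer)
        return None if target == self.name else target


def simulate(use_fwmark, max_hops=10):
    """Send one client packet to node A and follow it between the two nodes.

    Returns (node that delivered the packet to its application, hops taken),
    or (None, max_hops) if the packet was still bouncing when we gave up.
    """
    nodes = {
        "A": Director("A", "B", use_fwmark),
        "B": Director("B", "A", use_fwmark),
    }
    current, arrived_from, hops = "A", "client", 0
    while hops < max_hops:
        next_node = nodes[current].handle("conn-1", arrived_from)
        if next_node is None:
            return current, hops
        arrived_from, current = current, next_node
        hops += 1
    return None, hops


if __name__ == "__main__":
    print("VIP-based:   ", simulate(use_fwmark=False))
    print("fwmark-based:", simulate(use_fwmark=True))
```

In the VIP-based run the packet is never handed to an application and just bounces until the hop limit; in the fwmark-based run node B delivers it locally after a single forward, because the packet arriving from A carries no mark and so is not caught by B's LVS.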