Thread: [SSI-users] Please a bit of help with ha-lvs
From: K P. <ph...@to...> - 2004-11-29 22:01:39
I can't seem to get ha-lvs network load balancing to work right. Once the cluster comes up, half the time I can't ssh onto it. I'm assuming that I have some kind of routing or forwarding problem between the director and the real servers.

I've read README.CVIP, README.ipvs, and README.networking, and though I get most of it, some of it just isn't quite clear to me. I'd be grateful if someone could take a sec to review my setup so I could maybe pinpoint what is not right.

To get to the point, I'm setting up a 2-node cluster. Both machines are connected to each other using 1Gb Ethernet on eth1 on a private subnet, 192.168.10.0.

Each machine also has a 100Mb Ethernet card (eth0) connected to a LAN on subnet 192.168.0.0.

I'm running Debian and a compiled kernel marked 2.4.22-ac1-ssi.

I've set up both machines to be directors (fail-over) as follows:

<?xml version="1.0"?>
<cvips>
  <cvip>
    <ip_addr>192.168.0.125</ip_addr>
    <director_node>
      <node_num>1</node_num>
      <garp_interface>eth0</garp_interface>
      <sync_interface>eth1</sync_interface>
    </director_node>
    <director_node>
      <node_num>2</node_num>
      <garp_interface>eth0</garp_interface>
      <sync_interface>eth1</sync_interface>
    </director_node>
    <real_server_node>
      <node_num>1</node_num>
    </real_server_node>
    <real_server_node>
      <node_num>2</node_num>
    </real_server_node>
  </cvip>
</cvips>

I've tried to set it up both with and without NAT. However, from my understanding I shouldn't need NAT because both machines can directly address the external network. So I've got /etc/default/lvs_routing looking like this:

LVS_ROUTING=DR
LVS_INTERNAL_GW=

The gateway on subnet 192.168.0.0 is at 192.168.0.250.

#node1:> onall route
(node 1)
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.0.125   *               255.255.255.255 UH    0      0        0 eth0
localnet        *               255.255.255.0   U     0      0        0 eth0
192.168.10.0    *               255.255.255.0   U     0      0        0 eth1
default         192.168.0.250   0.0.0.0         UG    0      0        0 eth0
(node 2)
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.0.125   *               255.255.255.255 UH    0      0        0 lo
localnet        *               255.255.255.0   U     0      0        0 eth0
192.168.10.0    *               255.255.255.0   U     0      0        0 eth1
default         192.168.0.250   0.0.0.0         UG    0      0        0 eth0

#node1:> onall ifconfig
(node 1)
eth0      Link encap:Ethernet  HWaddr 00:10:4B:2C:0C:99
          inet addr:192.168.0.115  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:40867 errors:0 dropped:0 overruns:0 frame:0
          TX packets:29193 errors:0 dropped:0 overruns:0 carrier:0
          collisions:77 txqueuelen:1000
          RX bytes:20431725 (19.4 MiB)  TX bytes:5595501 (5.3 MiB)
          Interrupt:5 Base address:0xa000

eth0:125  Link encap:Ethernet  HWaddr 00:10:4B:2C:0C:99
          inet addr:192.168.0.125  Bcast:192.168.0.255  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:5 Base address:0xa000

eth1      Link encap:Ethernet  HWaddr 00:0E:0C:61:36:A7
          inet addr:192.168.10.1  Bcast:192.168.10.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:240569 errors:0 dropped:0 overruns:0 frame:0
          TX packets:258372 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:55417213 (52.8 MiB)  TX bytes:123756565 (118.0 MiB)
          Base address:0xa400 Memory:fb000000-fb020000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:526 errors:0 dropped:0 overruns:0 frame:0
          TX packets:526 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:56728 (55.3 KiB)  TX bytes:56728 (55.3 KiB)

(node 2)
eth0      Link encap:Ethernet  HWaddr 00:50:04:AD:DF:F3
          inet addr:192.168.0.110  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:22476 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6754 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:11993237 (11.4 MiB)  TX bytes:1937434 (1.8 MiB)
          Interrupt:5 Base address:0xa000

eth1      Link encap:Ethernet  HWaddr 00:0E:0C:61:32:EF
          inet addr:192.168.10.2  Bcast:192.168.10.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:256443 errors:0 dropped:0 overruns:0 frame:0
          TX packets:238636 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:120533361 (114.9 MiB)  TX bytes:55301627 (52.7 MiB)
          Base address:0xa400 Memory:fb000000-fb020000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:12 errors:0 dropped:0 overruns:0 frame:0
          TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:840 (840.0 b)  TX bytes:840 (840.0 b)

lo:125    Link encap:Local Loopback
          inet addr:192.168.0.125  Mask:255.255.255.255
          UP LOOPBACK RUNNING  MTU:16436  Metric:1

And here is what ipvsadm tells me:

#node1:> onall ipvsadm -L
(node 1)
IP Virtual Server version 1.0.10 (size=65536)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.0.125:x11 wlc
  -> node2.touchtunes.com:x11     Route   1      0          0
  -> node1.touchtunes.com:x11     Local   1      0          0
TCP  192.168.0.125:ssh wlc
  -> node2.touchtunes.com:ssh     Route   1      0          0
  -> node1.touchtunes.com:ssh     Local   1      1          0
(node 2)
IP Virtual Server version 1.0.10 (size=65536)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.0.125:x11 wlc
  -> node1.touchtunes.com:x11     Route   1      0          0
TCP  192.168.0.125:ssh wlc
  -> node1.touchtunes.com:ssh     Route   1      0          0

Now I use an external machine to test with ssh and it simply does not return, and ipvsadm gives me (while ssh is attempting a connection):

#node1:> onall ipvsadm -L
(node 1)
IP Virtual Server version 1.0.10 (size=65536)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.0.125:x11 wlc
  -> node2.touchtunes.com:x11     Route   1      0          0
  -> node1.touchtunes.com:x11     Local   1      0          0
TCP  192.168.0.125:ssh wlc
  -> node2.touchtunes.com:ssh     Route   1      0          1
  -> node1.touchtunes.com:ssh     Local   1      1          0
(node 2)
IP Virtual Server version 1.0.10 (size=65536)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.0.125:x11 wlc
  -> node1.touchtunes.com:x11     Route   1      0          0
TCP  192.168.0.125:ssh wlc
  -> node1.touchtunes.com:ssh     Route   1      0          0

I'm running sshd from inetd, and inetd seems to be running on both nodes as expected. As the command outputs above show, I've set up load balancing on ssh. I'm pretty much stumped at this stage.

Any help would be greatly appreciated.

Thanks...
From: John B. <joh...@hp...> - 2004-11-30 03:27:00
K Phillips wrote:
> I can't seem to get ha-lvs network load balancing to work right. Once the
> cluster comes up, half the time I can't ssh onto it. I'm assuming that I have
> some kind of routing or forwarding problem between the director and the real
> servers.
> [...]
> I'm running sshd from inetd and inetd seems to be running on both nodes as
> expected and obviously from the above command outputs, i've setup load balancing
> on ssh. I'm pretty much stumped at this stage.
>
> Any help would be greatly appreciated..
>
> Thanks...

A couple of basic things Bruce and I would like you to try:

1.) Can you ssh directly to node 2 via the 192.168.0.110 address?

2.) Could you run "tcpdump port 22" as root on node 2 and see the packets being received/sent when you try to ssh to the CVIP address? (At least when the connection is not obviously being formed on node 1.) If you don't have tcpdump installed, "apt-get install tcpdump" should install it if you have apt set up.

If you do see ssh packets being sent to node 2, but no responses being sent, you can try to strace inetd as it spawns the sshd. "strace -fF -o strace.out -p <pid of inetd>" should show you what is going on.

Let us know what you find and we'll see what we can do.

John
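For anyone following along, the checks John describes can be collected into a small script. This is only a sketch under assumptions: the `run` wrapper, the `DRY_RUN` variable, and the function names are invented here for illustration, and by default the script only prints the commands rather than executing them, since tcpdump and strace need root. The `-n` flag on tcpdump (skip name lookups) is an addition to John's command.

```shell
#!/bin/sh
# Sketch of the diagnostic steps from the message above.
# DRY_RUN, run(), and the function names are hypothetical helpers,
# not part of OpenSSI or ha-lvs.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1: ssh straight to a node's own address, bypassing the CVIP.
direct_ssh() {
    run ssh "$1"
}

# Step 2: on node 2, as root, watch ssh packets while a client
# connects to the CVIP.
capture_ssh() {
    run tcpdump -n port 22
}

# Follow-up: if packets arrive but nothing is answered, trace inetd
# (pass its pid) while it spawns sshd.
trace_inetd() {
    run strace -fF -o strace.out -p "$1"
}
```

With `DRY_RUN=1`, calling `direct_ssh 192.168.0.110`, `capture_ssh`, and `trace_inetd <pid of inetd>` just lists what would be executed; setting `DRY_RUN=0` as root on node 2 runs the real commands.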
From: Aneesh K. K.V <ane...@hp...> - 2004-11-30 09:12:53
K Phillips wrote:
> I can't seem to get ha-lvs network load balancing to work right. Once the
> cluster comes up, half the time I can't ssh onto it. I'm assuming that I have
> some kind of routing or forwarding problem between the director and the real
> servers.

On Debian, I have had a report that the default configuration Debian is installed with doesn't work when you have more than one network card. You can find more details in the archives. For a quick check, make sure

/proc/sys/net/ipv4/conf/all/rp_filter is zero.

If this works for you, do let me know. I will submit the fix to CVS.

-aneesh
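The quick check Aneesh suggests could be scripted roughly as follows. This is a sketch, not something from the OpenSSI tree: the function names are made up for illustration, and the optional directory argument exists only so the functions can be pointed at a test directory instead of the real /proc path. It lists every interface rather than just `all`, on the assumption that the per-interface values can also matter on some kernels.

```shell
#!/bin/sh
# Sketch only: inspect reverse-path filtering settings.
# show_rp_filter and rp_filter_enabled are hypothetical helper names.

# Print "<interface>=<value>" for each rp_filter entry under the
# given conf directory (default: the real /proc location).
show_rp_filter() {
    conf_dir="${1:-/proc/sys/net/ipv4/conf}"
    for f in "$conf_dir"/*/rp_filter; do
        [ -r "$f" ] || continue
        iface=$(basename "$(dirname "$f")")
        printf '%s=%s\n' "$iface" "$(cat "$f")"
    done
}

# Print only the interfaces whose rp_filter value is not zero.
rp_filter_enabled() {
    show_rp_filter "$1" | while IFS='=' read -r iface value; do
        if [ "$value" != "0" ]; then
            echo "$iface"
        fi
    done
}
```

Running `show_rp_filter` with no argument on each node shows the current values. To turn the filter off on a running system, writing zero as root with `echo 0 > /proc/sys/net/ipv4/conf/all/rp_filter` is the usual approach.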
From: Ken P. <ph...@to...> - 2004-11-30 13:59:49
Hmmm...what do you know. It works!

Thanks Aneesh. Your work and help is much appreciated.

On Tue, 2004-11-30 at 14:43 +0530, Aneesh Kumar K.V wrote:
> K Phillips wrote:
> > I can't seem to get ha-lvs network load balancing to work right. Once the
> > cluster comes up, half the time I can't ssh onto it. I'm assuming that I have
> > some kind of routing or forwarding problem between the director and the real
> > servers.
>
> On debian i had the report that the default configuration with which
> debian is installed doesn't work when you have more than one network
> card. You can find more details in the archives. For the quick check
> make sure
>
> /proc/sys/net/ipv4/conf/all/rp_filter is zero.
>
> If this works for you do let me know. I will submit the fix to cvs.
>
> -aneesh

--
Ken Phillips <ph...@to...>
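Since the setting in /proc is lost at reboot, one way to keep it might be a sysctl entry. The sketch below is an assumption, not the fix that was submitted to CVS: `persist_rp_filter_off` is a hypothetical helper, and it only appends the standard `net.ipv4.conf.all.rp_filter` key to a sysctl configuration file if that exact line is not already present. Other boot-time configuration on a given system may also touch this value, which the sketch does not handle.

```shell
#!/bin/sh
# Sketch only: record rp_filter=0 in a sysctl configuration file.
# persist_rp_filter_off is a hypothetical helper name; the file
# argument defaults to /etc/sysctl.conf and needs root to modify.
persist_rp_filter_off() {
    conf_file="${1:-/etc/sysctl.conf}"
    line='net.ipv4.conf.all.rp_filter = 0'
    # Append only when the exact line is missing, so repeated runs
    # do not duplicate the entry.
    if ! grep -qxF "$line" "$conf_file" 2>/dev/null; then
        echo "$line" >> "$conf_file"
    fi
}
```

After adding the entry, `sysctl -p` as root reloads the file without a reboot.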