From: Ray V. D. <ra...@di...> - 2006-04-05 06:47:56
|
Finally upgraded our Poptop server images to use CentOS 4.3 instead of Fedora Core 2, and upgraded Poptop to 1.3.0 at the same time... Everything seemed to work fine, however once a client connected and began pulling traffic we observed that traffic through the tunnel was moving *very* slowly. Around ~260Kbps max. Tried many things including messing with MTU settings, MSS clamping, changing the kernel (was using a custom 2.6.16 build, also used default Centos 2.6.9.x kernel with DKMS), enabling/disabling auditd, upgraded to pppd 2.4.4... Nothing made a difference. Finally removed pptpd 1.3.0 and installed 1.2.3. Problem went away immediately! As soon as I reverted back to 1.3.0 the slow transfers came back. I compared strace output from the pptpctrl processes from each version: From 1.2.3: Process 14016 attached - interrupt to quit select(8, [0 3 6 7], NULL, NULL, {56, 356000}) = 1 (in [7], left {52, 348000}) read(7, "E\0\0W27\0\0\200/\204\'\300\250\1\226\300\250\00130\1\210"..., 8260) = 87 write(6, "~\377}#\375\220C\357\332\216)i\270}2\276}0\361\327\352"..., 66) = 66 select(8, [0 3 6 7], NULL, NULL, {0, 50000}) = 1 (in [6], left {0, 50000}) read(6, "~\377}#\375\220}%\3511\277\243\254}*\363}8,\346\241{nO"..., 8196) = 73 writev(7, [{"0\201\210\v\0007\200\0\0\0\0\26\0\0\0S", 16}, {"\377\3\375\220\5\3511\277\243\254\n\363\30,\346\241{nO"..., 55}], 2) = 71 select(8, [0 3 6 7], NULL, NULL, {65, 0}) = 1 (in [7], left {65, 0}) read(7, "E\0\0S29\0\0\200/\204)\300\250\1\226\300\250\00130\201"..., 8260) = 83 write(6, "~\377}#\375\220DO\220J}>a\242\231\212\347Y:\\e}+j4\302"..., 54) = 54 select(8, [0 3 6 7], NULL, NULL, {0, 50000}) = 1 (in [7], left {0, 50000}) read(7, "E\0\2\0102;\0\0\200/\202r\300\250\1\226\300\250\00130\1"..., 8260) = 520 write(6, "~\377}#\375\220EY\324\356\235(=;\305\266u\272\310?w=9\202"..., 545) = 545 select(8, [0 3 6 7], NULL, NULL, {0, 50000}) = 1 (in [6], left {0, 48000}) read(6, "\377}#\375\220}&\341}(\347\335\201\3238}6\335}4\322\244"..., 8196) = 3070 writev(7, [{"0\201\210\v\0/\200\0\0\0\0\27\0\0\0U", 16}, {"\377\3\375\220\6\341\10\347\335\201\3238\26\335\24\322"..., 47}], 2) = 63 writev(7, [{"0\1\210\v\5\33\200\0\0\0\0\30", 12}, {"\377\3\375\220\7\273RW\207\371\360\353.\34\377\225\262"..., 1307}], 2) = 1319 writev(7, [{"0\1\210\v\5\33\200\0\0\0\0\31", 12}, {"\377\3\375\220\10\7\303h\260\203\336\274CK\6\241\212\334"..., 1307}], 2) = 1319 From 1.3.0 (with -s 512): [root@localhost i386]# strace -p 31496 -s 512 Process 31496 attached - interrupt to quit select(8, [0 3 6 7], NULL, NULL, {59, 780000}) = 1 (in [7], left {58, 628000}) read(7, "E\0\0\207;T\0\0\200/z\332\300\250\1\226\300\250\00130\1\210\v\0g\0\0\0\0\t\36\377\3\375\230\272\350\27N\212\206H\0\32W\276<\25$f\345Z\32\330R^R0\22\35\360\272\224\265\305\322\367/P\230\34\231?\300UU\340\356\244G\215\27\34\206\251\220\303\25m\262\34D\355B\373vE(\250/\314\365\224u\317\255AI\373\311k\244\322N\355\220\320\213_\266\361\r\314\272\267\301L3\10:\204\202\16C", 8260) = 135 time([1144218604]) = 1144218604 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1017, ...}) = 0 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1017, ...}) = 0 stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=1017, ...}) = 0 send(5, "<31>Apr 4 23:30:04 pptpd[31496]: GRE: accepting packet #2334", 61, MSG_NOSIGNAL) = 61 write(6, "~\377}#\375\230\272\350}7N\212\206H} }:W\276<}5$f\345Z}:\330R^R0}2}=\360\272\224\265\305\322\367/P\230}<\231?\300UU\340\356\244G\215}7}<\206\251\220\303}5m\262}<D\355B\373vE(\250/\314\365\224u\317\255AI\373\311k\244\322N\355\220\320\213_\266\361}-\314\272\267\301L3}(:\204\202}.CP}+~", 124) = 124 My guess is that pptpgre.c is writing every single received GRE packet to syslog, and since I happen to have had a target set up to handle it, it's slowing everything down. :) Note: I do not have debug options specified in pptpd.conf nor in options.pptpd. I imagine the fix for this is to modify pptpgre.c to do some sort of check for the debug option before calling syslog(LOG_DEBUG, ...) ? In the meantime, removing the *.debug target from my syslogd.conf file seems to do the trick as well although I bet the syslog calls are still slowing things down a bit. Ray -- Ray Van Dolson Linux/Unix Systems Administrator Digital Path, Inc. |
From: gARetH b. <hic...@gi...> - 2006-04-05 06:58:55
|
On Tue, 4 Apr 2006, Ray Van Dolson wrote: > My guess is that pptpgre.c is writing every single received GRE packet to > syslog, and since I happen to have had a target set up to handle it, it's > slowing everything down. :) Oh yes, big time, when you get load averages of 10 and you have one pptp going you know you have a problem ... There are two LOG_DEBUG things I remove. diff -r -u pptpd-1.3.1/pptpgre.c pptpd-1.3.1_/pptpgre.c --- pptpd-1.3.1/pptpgre.c 2005-12-29 00:10:32.000000000 +0000 +++ pptpd-1.3.1_/pptpgre.c 2006-02-18 12:27:39.000000000 +0000 @@ -319,7 +319,6 @@ stats.rx_lost += head->seq - gre.seq_recv - 1; syslog(LOG_DEBUG, "GRE: timeout waiting for %d packets", head->seq - gre.seq_recv - 1); } - syslog(LOG_DEBUG, "GRE: accepting #%d from queue", head->seq); gre.seq_recv = head->seq; status = callback(cl, head->packet, head->packlen); pqueue_del(head); @@ -399,7 +398,6 @@ } /* check for out-of-order sequence number */ if (seq_greater(seq, gre.seq_recv)) { - syslog(LOG_DEBUG, "GRE: accepting packet #%d", seq); stats.rx_accepted++; gre.seq_recv = seq; return cb(cl, buffer + ip_len + headersize, payload_len); |
From: James C. <jam...@hp...> - 2006-04-05 07:01:27
|
The patch in CVS only reports the packet in debug mode. -- James Cameron http://quozl.netrek.org/ HP Open Source, Volunteer http://opensource.hp.com/ PPTP Client Project, Release Engineer http://pptpclient.sourceforge.net/ |
From: James C. <jam...@hp...> - 2006-04-05 07:00:10
|
Ray wrote: > My guess is that pptpgre.c is writing every single received GRE packet to > syslog, and since I happen to have had a target set up to handle it, it's > slowing everything down. :) Yes, that seems likely. It would certainly cause an additional context switch on each packet. There is a patch discussed already which is in CVS, yet to be released, probably slated for 1.3.2. Patch is dated 27th March. --=20 James Cameron http://quozl.netrek.org/ HP Open Source, Volunteer http://opensource.hp.com/ PPTP Client Project, Release Engineer http://pptpclient.sourceforge.net/ |
From: Ray V. D. <ra...@di...> - 2006-04-05 07:22:23
|
On Wed, Apr 05, 2006 at 04:59:34PM +1000, James Cameron wrote: > Yes, that seems likely. It would certainly cause an additional context > switch on each packet. There is a patch discussed already which is in > CVS, yet to be released, probably slated for 1.3.2. Patch is dated 27th > March. > Excellent. Thanks James. I've merged this in to my tree. Apologies if I missed discussion on the list regarding this previously. All is well again. Ray |