Thread: [VirtualGL-Users] VNC session seems to be locked
From: Rafael G. <rg...@gm...> - 2013-10-23 18:37:02
|
Hi,

I have created a VNC session, and after using it for some time, it seems to be "locked". It is configured to use one-time passwords, and I can't generate a new OTP to access the session: vncpasswd hangs indefinitely when trying to get a new password.

The session was created with the following command line:

/opt/TurboVNC/bin/Xvnc :13 -desktop X -httpd /opt/TurboVNC/bin/../vnc/classes -auth /u/cnhu/.Xauthority -dontdisconnect -geometry 3192x1046 -depth 24 -rfbwait 120000 -otpauth -rfbport 5913 -fp /usr/share/X11/fonts/misc,/usr/share/X11/fonts/75dpi,/usr/share/X11/fonts/100dpi,/usr/share/X11/fonts/Type1,/usr/share/fonts/default/Type1 -co /usr/share/X11/rgb -deferupdate 1

When I run "vncpasswd -otp -display :13", I get no response and eventually have to press Ctrl+C. I ran the vncpasswd command line under strace, and it seems to hang while trying to connect to Xvnc through the "/tmp/.X11-unix/X13" socket (not sure if I've got it right). The end of the strace output, up to the point where it hangs, is below.

What could the problem be? The same thing has happened twice, on two different machines. I have several other VNC sessions where this does not happen. Is there anywhere else I can collect information that would help me figure out what happened?
uname({sys="Linux", node="server03", ...}) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 3
uname({sys="Linux", node="server03", ...}) = 0
uname({sys="Linux", node="server03", ...}) = 0
connect(3, {sa_family=AF_FILE, path="/tmp/.X11-unix/X13"...}, 20) = 0
uname({sys="Linux", node="server03", ...}) = 0
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
access("/u/cnhu/.Xauthority", R_OK) = 0
open("/u/cnhu/.Xauthority", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0600, st_size=13139, ...}) = 0
mmap(NULL, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ab9583b5000
read(4, "\1\0\0\10mi080113\0\00210\0\22MIT-MAGIC-COOK"..., 32768) = 13139
read(4, "", 32768) = 0
close(4) = 0
munmap(0x2ab9583b5000, 32768) = 0
writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, {"MIT-MAGIC-COOKIE-1", 18}, {"\0\0", 2}, {"e\30\226A>\316@\17Y8\365+V\21\365\f", 16}], 4) = 48
fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
read(3, 0x7fff0b98aab0, 8) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=3, events=POLLIN}], 1, -1 <unfinished ...>

Cheers,
Rafael |
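For readers following the trace: the writev() on fd 3 is the X11 connection setup request that Xlib sends right after connect(). Its 12-byte header carries the byte order ('l' for little-endian), protocol version 11.0, and the lengths of the authorization name (18, "MIT-MAGIC-COOKIE-1") and data (16), each of which is then padded to a multiple of 4 bytes. The sketch below only reproduces that arithmetic so the 48 in the trace can be checked by hand. The helper names are made up for illustration and are not part of Xlib or TurboVNC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round n up to the next multiple of 4, as the X11 protocol pads fields. */
static size_t pad4(size_t n)
{
    return (n + 3u) & ~(size_t)3u;
}

/* Hypothetical helper: total size in bytes of an X11 connection setup
   request with the given authorization name and data lengths. */
static size_t x11_setup_request_size(size_t auth_name_len, size_t auth_data_len)
{
    return 12 + pad4(auth_name_len) + pad4(auth_data_len);
}

/* Hypothetical helper: fill the 12-byte setup header for a little-endian
   client speaking protocol version 11.0. */
static void x11_setup_header(uint8_t hdr[12], uint16_t name_len, uint16_t data_len)
{
    assert(hdr != NULL);
    memset(hdr, 0, 12);
    hdr[0] = 'l';                        /* byte order: LSB first */
    hdr[2] = 11;                         /* protocol major version (low byte) */
    hdr[6] = (uint8_t)(name_len & 0xff); /* auth name length, little-endian */
    hdr[7] = (uint8_t)(name_len >> 8);
    hdr[8] = (uint8_t)(data_len & 0xff); /* auth data length, little-endian */
    hdr[9] = (uint8_t)(data_len >> 8);
}
```

With name length 18 and data length 16 this gives 12 + 20 + 16 = 48 bytes, matching the return value of writev() above. The request was therefore sent in full, and the client is waiting for the server's reply.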
From: DRC <dco...@us...> - 2013-10-24 03:07:47
|
Can you figure out where in the vncpasswd code it's locking up? That would help diagnose the issue. If you're using an RPM-based system, you can install the turbovnc-debuginfo RPM and get a stack trace from gdb when it locks up. Otherwise, you'll have to build vncpasswd from source, but it's straightforward.

This is a complete shot in the dark, but I'm wondering if maybe the /dev/urandom device is somehow giving you problems on some of your machines. vncpasswd will read from /dev/urandom when it generates an OTP. If that is the problem, then adding

#undef UseDevUrandom

to the top of vncpasswd.c should temporarily work around it, and it would be easy to add a more permanent workaround (a command-line switch that avoids using /dev/urandom).
|
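To test the /dev/urandom theory in isolation, a small standalone read of the device is enough. The following is a minimal sketch under that assumption. It is not the code from vncpasswd.c, and the function name read_urandom is invented for this example. It opens the device non-blocking so that a misbehaving device shows up as an error rather than as a hang.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not taken from vncpasswd.c): fill buf with len
   random bytes from /dev/urandom. Returns 0 on success, -1 on failure. */
static int read_urandom(unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    int fd = open("/dev/urandom", O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return -1;

    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);
        if (n > 0)
            got += (size_t)n;      /* short reads are allowed, keep going */
        else if (n < 0 && errno == EINTR)
            continue;              /* interrupted by a signal, retry */
        else
            break;                 /* EOF, EAGAIN, or a real error */
    }

    close(fd);
    return got == len ? 0 : -1;
}
```

If a call such as read_urandom(buf, 16) returns 0 promptly on an affected machine, the device itself is probably not what is blocking vncpasswd.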
From: Rafael G. <rg...@gm...> - 2013-10-24 12:56:39
|
Hi DRC,

I followed the best debugging practice there is (don't worry, I'm being ironic :)), putting printfs around the possible problem, and I figured out that vncpasswd hangs when calling XOpenDisplay. When executing the code below, it prints "Opening display :13" and nothing else. If I run it under strace, I get the trace below.

I didn't do the gdb test only because I am not very familiar with gdb (I have used it before, but that was ages ago). If the following information is not enough and gdb would help, I am sure I can handle it, no problem. I am just trying to make things easier for myself! A little bit egocentric, I know, but I just can't help it... :)

If you check the strace output, it hangs on the last line, where it seems to be trying to communicate through the socket "/tmp/.X11-unix/X13". I imagine it is trying to reach Xvnc on display 13 and is getting no response, so it keeps waiting forever... Am I right? What should I do if this is the case?

Cheers,
Rafael

VNCPASSWD CODE WITH PRINTFS:

int DoOTP()
{
  unsigned int full;
  unsigned int view = 0;
  Display* dpy;
  Atom prop;
  int len;
  char buf[MAXPWLEN + 1];
  char bytes[MAXPWLEN * 2];
#ifdef UseDevUrandom
  int fd;
#endif

  printf("Opening display %s\n", displayname);  /* added for debugging */
  if ((dpy = XOpenDisplay(displayname)) == NULL) {
    fprintf(stderr, "unable to open display \"%s\"\n",
            XDisplayName(displayname));
    return(1);
  }
  printf("Display opened\n");  /* added for debugging; never reached */

STRACE:

write(1, "Opening display :13\n", 20Opening display :13
) = 20
brk(0) = 0x59f3000
brk(0x5a14000) = 0x5a14000
uname({sys="Linux", node="server03", ...}) = 0
socket(PF_FILE, SOCK_STREAM, 0) = 3
uname({sys="Linux", node="server03", ...}) = 0
uname({sys="Linux", node="server03", ...}) = 0
connect(3, {sa_family=AF_FILE, path="/tmp/.X11-unix/X13"...}, 20) = 0
uname({sys="Linux", node="server03", ...}) = 0
fcntl(3, F_SETFD, FD_CLOEXEC) = 0
access("/u/cnhu/.Xauthority", R_OK) = 0
open("/u/cnhu/.Xauthority", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0600, st_size=13139, ...}) = 0
mmap(NULL, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b34aa97f000
read(4, "\1\0\0\10mi080113\0\00210\0\22MIT-MAGIC-COOK"..., 32768) = 13139
read(4, "", 32768) = 0
close(4) = 0
munmap(0x2b34aa97f000, 32768) = 0
writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, {"MIT-MAGIC-COOKIE-1", 18}, {"\0\0", 2}, {"e\30\226A>\316@\17Y8\365+V\21\365\f", 16}], 4) = 48
fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
read(3, 0x7fff865304b0, 8) = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=3, events=POLLIN}], 1, -1 <unfinished ...>
|
From: Rafael G. <rg...@gm...> - 2013-10-24 14:04:08
|
Hi DRC,

I overcame my laziness and ran vncpasswd under gdb. The result was pretty much the same as what strace and the printfs showed. The problem really seems to be in the communication with the Xvnc server...

Cheers,
Rafael

Reading symbols from /opt/TurboVNC/bin/vncpasswd...Reading symbols from /usr/lib/debug/opt/TurboVNC/bin/vncpasswd.debug...done.
done.
(gdb) r -o -display :13
Starting program: /opt/TurboVNC/bin/vncpasswd -o -display :13

Program received signal SIGINT, Interrupt.
0x000000371f6cc30f in poll () from /lib64/libc.so.6
(gdb) bt
#0 0x000000371f6cc30f in poll () from /lib64/libc.so.6
#1 0x000000372124a9f0 in ?? () from /usr/lib64/libX11.so.6
#2 0x000000372124ae19 in _XRead () from /usr/lib64/libX11.so.6
#3 0x00000037212378c9 in XOpenDisplay () from /usr/lib64/libX11.so.6
#4 0x000000000040187c in DoOTP () at vncpasswd.c:81
#5 0x0000000000402187 in main (argc=4, argv=0x7fffffffe748) at vncpasswd.c:293
(gdb)
|
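The backtrace in this message ends in poll() underneath _XRead(), and the earlier strace shows that poll() was called with a timeout of -1, which means "wait forever". One way to tell a silent X server from a slow one without Xlib is a small probe that connects to the same Unix socket, sends a setup header with no authorization, and waits for any reply with a finite timeout. The sketch below assumes that a healthy X server answers such a request with either an acceptance or a refusal, so any reply at all counts as "alive". The function names and the timeout value are illustrative, not part of TurboVNC.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to report an event (data, hang-up, or
   error). Returns 1 on an event, 0 on timeout, -1 on a poll() failure. */
static int wait_readable(int fd, int timeout_ms)
{
    assert(timeout_ms >= 0);
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    for (;;) {
        int rc = poll(&pfd, 1, timeout_ms);
        if (rc > 0)
            return 1;
        if (rc == 0)
            return 0;
        if (errno != EINTR)
            return -1;
        /* interrupted by a signal: retry */
    }
}

/* Hypothetical probe: connect to an X11 Unix socket, send a 12-byte setup
   header without authorization, and report whether the server answers.
   Returns 1 if it answered, 0 if it stayed silent, -1 on a local error. */
static int probe_x11_socket(const char *path, int timeout_ms)
{
    static const unsigned char setup[12] = { 'l', 0, 11, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
    struct sockaddr_un addr;
    int fd, rc;

    if (strlen(path) >= sizeof addr.sun_path)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        write(fd, setup, sizeof setup) != (ssize_t)sizeof setup) {
        close(fd);
        return -1;
    }

    rc = wait_readable(fd, timeout_ms);
    close(fd);
    return rc;
}
```

A call such as probe_x11_socket("/tmp/.X11-unix/X13", 2000) returning 0 would be consistent with what the traces show: the connection is accepted, but nothing ever comes back on it.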
From: Rafael G. <rg...@gm...> - 2013-10-24 14:27:24
|
Hi DRC,

Just one last piece of information... I tried to telnet to Xvnc (on ports 5813 and 5913, since I am using display 13) to see whether the Xvnc process is completely locked up, and it seems to be responding to my connections correctly.

I opened a telnet connection to port 5813, sent "GET /VncViewer.jar", and was able to download the applet. Then I opened a telnet connection to port 5913, received "RFB 003.008", sent "RFB 003.003", and got some binary data I was not able to decipher... But it seems to be responding correctly; it is not completely locked up. I don't know what further tests I should do to find out what's happening. Since it has already happened twice, on two different machines, I would like to learn as much as possible in order to prevent it from happening again in the future...

Cheers,
Rafael
|
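A note on the telnet test: the ports follow from the display number (5800 + 13 for the HTTP server, 5900 + 13 for RFB), and the "RFB 003.008" line is the server's protocol version greeting. After the client answers with "RFB 003.003", the next thing a server sends in that protocol version is a binary security-type field, which is most likely the "binary info" seen here. The sketch below only covers the port arithmetic and the parsing of the greeting. The function names are invented for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Default VNC ports derived from the X display number. */
static int rfb_port(int display)  { return 5900 + display; }
static int http_port(int display) { return 5800 + display; }

/* Hypothetical helper: parse an RFB greeting such as "RFB 003.008\n"
   into its major and minor version. Returns 0 on success, -1 otherwise. */
static int parse_rfb_version(const char *line, int *major, int *minor)
{
    assert(line != NULL && major != NULL && minor != NULL);
    /* %3d reads at most three decimal digits; leading zeros are fine. */
    return sscanf(line, "RFB %3d.%3d", major, minor) == 2 ? 0 : -1;
}
```

For display 13 this yields ports 5813 and 5913, and the greeting above parses as major 3, minor 8.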
From: DRC <dco...@us...> - 2013-10-24 19:38:00
|
Since Xvnc is single-threaded, if you are able to telnet in on 59xx, that means that the server is not locked up. That leaves the following as possibilities in my mind:

-- Perhaps this is due to a problem in the X11 client library. If so, then there's probably nothing I can do about it. Try updating your system to the latest O/S patches?

-- Double-check permissions on /tmp, /tmp/.X11-unix, etc. They should be writable by whoever is launching the VNC server.

-- If all else fails, then perhaps you can work around it by disabling local X11 communications. You do this by starting Xvnc with '-nolisten local', which forces it to listen for X11 traffic on a TCP socket rather than a local socket, so it will not use /tmp/.X11-unix/X* in that case.

Several things that I still don't understand about this situation:

-- If you mentioned what O/S these machines are running, I missed it, so please provide that information. If it's an O/S I haven't tested, then I'm happy to try reproducing it on my end, but you implied that there was nothing different about the machines that were experiencing the bug vs. the machines that aren't (?)

-- When the X server "locks up", does it remain locked? That is, do repeated attempts to run vncpasswd fail in the same way?

-- Have you tried to assign a regular VNC password to the session? That would give you a way to log in and verify whether it is still properly accepting client connections once vncpasswd starts failing.

I'm not saying that there's no possibility of a bug in TurboVNC, but we do have thousands of seats worldwide that are using portals built around the OTP functionality. These portals are dynamically generating OTPs every time a user launches or reconnects to a session, so it seems to me that if this was a bug in TurboVNC, someone else would have stumbled upon it by now.
If you're using an RPM-based > system, you > can install the turbovnc-debuginfo RPM and get a stack trace > from gdb > when it locks up. Otherwise, you'll have to build vncpasswd > from > source, but it's straightforward. > > This is a complete shot in the dark, but I'm wondering if > maybe the > /dev/urandom device is somehow giving you problems on some > of your > machines. vncpasswd will read from /dev/urandom when it > generates an > OTP. If that is the problem, then adding > > #undef UseDevUrandom > > to the top of vncpasswd.c should temporarily work around it, > and it > would be easy to add a more permanent workaround (a command > line switch > that avoids using /dev/urandom.) > > > On 10/23/13 1:36 PM, Rafael Guimaraes wrote: > > Hi, > > > > I have created a VNC session and after using it for sometime, it seems > > to be "locked". It is configured to use one-time-passwords and I just > > can't generate a new otp for access the session, vncpasswd hangs > > indefinitely when trying to get a new password. > > > > The session was created by the following command line: > > /opt/TurboVNC/bin/Xvnc :13 -desktop X -httpd > > /opt/TurboVNC/bin/../vnc/classes -auth /u/cnhu/.Xauthority > > -dontdisconnect -geometry 3192x1046 -depth 24 -rfbwait 120000 -otpauth > > -rfbport 5913 -fp > > /usr/share/X11/fonts/misc,/usr/share/X11/fonts/75dpi,/usr/share/X11/fonts/100dpi,/usr/share/X11/fonts/Type1,/usr/share/fonts/default/Type1 > > -co /usr/share/X11/rgb -deferupdate 1 > > > > When trying to launch a "vncpasswd -otp -display :13", I get no response > > and eventually I have to press Ctrl+C. I have runned the vncpasswd > > command line with strace and it seems to hang while trying to connect to > > Xvnc through the "/tmp/.X11-unix/X13" socket (not sure if I've got it > > right). Below you may check the end of the strace (until it hangs). > > > > What may the problem be? The same happened twice in two different > > machines. 
I have several other VNC sessions where this does not happen. > > Can I collect information from anywhere else that could help me trying > > to figure out what happened? > > > > uname({sys="Linux", node="server03", ...}) = 0 > > socket(PF_FILE, SOCK_STREAM, 0) = 3 > > uname({sys="Linux", node="server03", ...}) = 0 > > uname({sys="Linux", node="server03", ...}) = 0 > > connect(3, {sa_family=AF_FILE, path="/tmp/.X11-unix/X13"...}, 20) = 0 > > uname({sys="Linux", node="server03", ...}) = 0 > > fcntl(3, F_SETFD, FD_CLOEXEC) = 0 > > access("/u/cnhu/.Xauthority", R_OK) = 0 > > open("/u/cnhu/.Xauthority", O_RDONLY) = 4 > > fstat(4, {st_mode=S_IFREG|0600, st_size=13139, ...}) = 0 > > mmap(NULL, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, > > 0) = 0x2ab9583b5000 > > read(4, "\1\0\0\10mi080113\0\00210\0\22MIT-MAGIC-COOK"..., 32768) = 13139 > > read(4, "", 32768) = 0 > > close(4) = 0 > > munmap(0x2ab9583b5000, 32768) = 0 > > writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, {"MIT-MAGIC-COOKIE-1", > > 18}, {"\0\0", 2}, {"e\30\226A>\316@\17Y8\365+V\21\365\f", 16}], 4) = 48 > > fcntl(3, F_GETFL) = 0x2 (flags O_RDWR) > > fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 > > read(3, 0x7fff0b98aab0, 8) = -1 EAGAIN (Resource > > temporarily unavailable) > > poll([{fd=3, events=POLLIN}], 1, -1 <unfinished ...> > > ------------------------------------------------------------------------------ > October Webinars: Code for Performance > Free Intel webinars can help you accelerate application > performance. > Explore tips for MPI, OpenMP, advanced profiling, and more. > Get the most from > the latest Intel processors and coprocessors. See abstracts > and register > > http://pubads.g.doubleclick.net/gampad/clk?id=60135991&iu=/4140/ostg.clktrk > _______________________________________________ > VirtualGL-Users mailing list > Vir...@li... 
> <mailto:Vir...@li...> > https://lists.sourceforge.net/lists/listinfo/virtualgl-users > > > > > > > ------------------------------------------------------------------------------ > October Webinars: Code for Performance > Free Intel webinars can help you accelerate application performance. > Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from > the latest Intel processors and coprocessors. See abstracts and register > > http://pubads.g.doubleclick.net/gampad/clk?id=60135991&iu=/4140/ostg.clktrk > > > > _______________________________________________ > VirtualGL-Users mailing list > Vir...@li... > https://lists.sourceforge.net/lists/listinfo/virtualgl-users > |
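The telnet check on port 59xx discussed in this message can also be scripted. What follows is only a minimal sketch, assuming Python 3 is available on the server; the function name `read_rfb_banner` and the 5-second default timeout are made up for illustration. It relies on one thing reported in the thread: the server sends a 12-byte version string such as "RFB 003.008" as soon as a client connects. The timeout keeps the probe from blocking forever the way vncpasswd does.

```python
import socket


def read_rfb_banner(host, port, timeout=5.0):
    """Connect to an RFB (VNC) port and return the version banner.

    Returns the banner without its trailing newline (for example
    "RFB 003.008"), or None if the connection is refused, times out,
    or does not produce a well-formed banner.
    """
    data = b""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            # The RFB ProtocolVersion message is exactly 12 bytes long.
            while len(data) < 12:
                chunk = sock.recv(12 - len(data))
                if not chunk:
                    break  # server closed the connection early
                data += chunk
    except OSError:
        # socket.timeout and ConnectionRefusedError are both OSError subclasses.
        return None
    if len(data) == 12 and data.startswith(b"RFB "):
        return data.decode("ascii", errors="replace").rstrip("\n")
    return None
```

For the session in this thread, the call would be `read_rfb_banner("localhost", 5913)`. A banner coming back only shows that the RFB side of Xvnc is still being serviced; it says nothing about whether X11 clients such as vncpasswd can connect.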
From: Rafael G. <rg...@gm...> - 2013-10-25 11:46:42
|
Replying to what you asked:

- I really didn't mention my OS. It is CentOS 5.9, 64-bit. It is pretty much up to date.
- Yes, when the X server locks up it remains locked. I have one Xvnc locked up right now and that's where I am performing all the tests.
- Permissions on the socket and its directories are OK, I have just checked them (they are all 777, with the sticky bit for /tmp and /tmp/.X11-unix and setuid for the socket file).
- When I run xwd on the same session, it also freezes, and it communicates with Xvnc using a regular socket (through port 6013). This means that Xvnc is not responding to the poll no matter how I communicate (through a local or a regular socket), and it also means that the problem is not restricted to generating OTPs. Based on that, I think that disabling local sockets won't help.
- I have also been running a portal based on TurboVNC/VirtualGL for a couple of years and this is the first time that something like this has happened.

I don't know if this also happens on a healthy session, but I ran strace on the problematic Xvnc process and it keeps doing the same thing (looping over the system calls shown below), even when I run vncpasswd, whereas I can see different behavior, i.e. normal processing, when I telnet to port 5913.

select(128, [0 1 3 4 5 6 7 9 11 13 14 17 18 19 20 21 22 24 25 26 27], NULL, NULL, {593, 19000}) = -1 EBADF (Bad file descriptor)
select(6, [5], NULL, NULL, {0, 0}) = 0 (Timeout)
select(7, [6], NULL, NULL, {0, 0}) = 0 (Timeout)
select(8, [7], NULL, NULL, {0, 0}) = 1 (in [7], left {0, 0})
select(12, [11], NULL, NULL, {0, 0}) = 0 (Timeout)
select(14, [13], NULL, NULL, {0, 0}) = 1 (in [13], left {0, 0})
select(15, [14], NULL, NULL, {0, 0}) = 0 (Timeout)
select(18, [17], NULL, NULL, {0, 0}) = 0 (Timeout)
select(19, [18], NULL, NULL, {0, 0}) = 1 (in [18], left {0, 0})
select(20, [19], NULL, NULL, {0, 0}) = 0 (Timeout)
select(21, [20], NULL, NULL, {0, 0}) = 0 (Timeout)
select(22, [21], NULL, NULL, {0, 0}) = 0 (Timeout)
select(23, [22], NULL, NULL, {0, 0}) = 0 (Timeout)
select(25, [24], NULL, NULL, {0, 0}) = 0 (Timeout)
select(26, [25], NULL, NULL, {0, 0}) = 1 (in [25], left {0, 0})
select(27, [26], NULL, NULL, {0, 0}) = 0 (Timeout)
select(28, [27], NULL, NULL, {0, 0}) = 0 (Timeout)
select(13, [3], NULL, NULL, {0, 0}) = 0 (Timeout)
select(5, [4], NULL, NULL, {0, 0}) = 0 (Timeout)

- I also tried to set up a regular VNC password for the session (I did it by running "vncpasswd -d :13", is that correct? While doing so, I monitored Xvnc with strace and it did nothing different from the selects shown above). Then I tried to open the VncViewer, passing the password on the command line. I got the following output:

Initializing...
Connecting to localhost, port 5913...
Connected to server
Tentando conectar em localhost na porta 5555 [Portuguese: "Trying to connect to localhost on port 5555"]
RFB server supports protocol version 3.8
Using RFB protocol version 3.8
Enabling TightVNC protocol extensions
Performing standard VNC authentication
Error: The one-time password has not been set on the server
java.lang.Exception: The one-time password has not been set on the server
at RfbProto.readConnFailedReason(RfbProto.java:574)
at RfbProto.readSecurityResult(RfbProto.java:556)
at RfbProto.authenticateVNC(RfbProto.java:475)
at VncViewer.connectAndAuthenticate(VncViewer.java:370)
at VncViewer.run(VncViewer.java:188)
at java.lang.Thread.run(Unknown Source)
RFB socket closed
Closing window
Disconnecting

- And the following strace for Xvnc:

accept(3, {sa_family=AF_INET, sin_port=htons(57854), sin_addr=inet_addr("127.0.0.1")}, [54348846876065808]) = 8
fcntl(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0
write(2, "\n", 1) = 1
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0
write(2, "25/10/2013 09:38:41 ", 20) = 20
write(2, "Got connection from client 127.0"..., 37) = 37
getpeername(8, {sa_family=AF_INET, sin_port=htons(57854), sin_addr=inet_addr("127.0.0.1")}, [34359738384]) = 0
write(8, "RFB 003.008\n", 12) = 12
read(8, "RFB 003.008\n", 12) = 12
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0
write(2, "25/10/2013 09:38:44 ", 20) = 20
write(2, "Using protocol version 3.8\n", 27) = 27
write(8, "\2\2\20", 3) = 3
read(8, "\20", 1) = 1
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0
write(2, "25/10/2013 09:38:44 ", 20) = 20
write(2, "Enabling TightVNC protocol exten"..., 38) = 38
write(8, "\0\0\0\0", 4) = 4
write(8, "\0\0\0\1", 4) = 4
write(8, "\0\0\0\2STDVVNCAUTH_", 16) = 16
read(8, "\0\0\0\2", 4) = 4
write(8, "!\236\\U\252\30\25\356\265:\254\275$=\35s", 16) = 16
read(8, "J\221#\247]\376\274I\370C\232\314h\334;\263", 16) = 16
write(8, "\0\0\0\1\0\0\0004", 8) = 8
write(8, "The one-time password has not be"..., 52) = 52
close(8) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0
write(2, "25/10/2013 09:38:44 ", 20) = 20
write(2, "Client 127.0.0.1 gone\n", 22) = 22
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0
write(2, "25/10/2013 09:38:44 ", 20) = 20
write(2, "Statistics:\n", 12) = 12
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0
write(2, "25/10/2013 09:38:44 ", 20) = 20
write(2, " framebuffer updates 0, rectang"..., 47) = 47

Any ideas?

Best regards,

Rafael
|
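The first select() call in the Xvnc trace fails with EBADF, which means that at least one descriptor in the set passed to select() is not an open file descriptor. As a minimal sketch of that failure mode (this is not Xvnc's code; the helper names `find_closed_fds`, `select_error` and `demo` are made up for illustration, and Python 3 on Linux is assumed), the snippet below reproduces the error with a pipe and then checks descriptors one at a time with `os.fstat()` to see which one is stale.

```python
import errno
import os
import select


def find_closed_fds(fds):
    """Return the descriptors in `fds` that are no longer open."""
    closed = []
    for fd in fds:
        try:
            os.fstat(fd)
        except OSError as exc:
            if exc.errno == errno.EBADF:
                closed.append(fd)
            else:
                raise
    return closed


def select_error(fds):
    """Call select() once with a zero timeout; return the errno or None."""
    try:
        select.select(fds, [], [], 0)
    except OSError as exc:
        return exc.errno
    return None


def demo():
    """Build a read set that contains one stale descriptor number."""
    r, w = os.pipe()
    os.close(r)  # r is now a stale descriptor number
    try:
        return select_error([r, w]), find_closed_fds([r, w]), r
    finally:
        os.close(w)
```

The single-descriptor select() calls that follow the failing call in the trace look like a similar one-by-one probe, although that is an inference from the strace output rather than something confirmed from the Xvnc source. On Linux, the descriptors that a running process actually has open are listed under /proc/<pid>/fd, which can be compared against the set shown in the failing select() call.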
From: DRC <dco...@us...> - 2013-10-25 15:30:41
|
Really not sure what's going on here. I am running CentOS 5.9 as well, and I launched Xvnc using the exact same command line that you used below. I set up a script that generates 10,000 OTPs in rapid succession, then I ran 10 copies of it in parallel, and all of them ran to completion with no errors or lockups. I tried again using a script that creates 1000 OTPs in rapid succession, and I ran 100 copies of it. This was using the latest stable build of 1.2.x available at http://virtualgl.sourceforge.net/vnc.nightly/, but nothing has changed since 1.2 that would have affected this, so I would expect identical behavior from 1.2.

The fact that you have been running for years with no errors really seems to indicate that something has changed at the system level. We have not modified the way OTPs are generated, and in fact, I actually don't think that feature has been touched since it was first introduced in 1.0.

One comment I will make is that you aren't setting up the VNC password correctly below. You have to pass an argument of -rfbauth to Xvnc in order to set it up with a "normal" VNC password.

Rather than an strace output, it would be more useful to know where in the Xvnc code the "infinite loop" is occurring. You should be able to get that info with gdb. In the past, there have been several cases in which I was able to diagnose an Xvnc error by simply looking at the code, without even having to reproduce it.

On 10/25/13 6:46 AM, Rafael Guimaraes wrote: > Replying what you asked: > > - I really didn't mention my OS. It is a CentOS 5.9 64 bits. It is > pretty much updated. > - Yes, when the X server locks up it remains locked. I have one Xvnc > locked up right now and that's where I am performing all the tests. > - Sockets permissions as well as its directories are ok, I have just > checked it (they are all 777, with sticky bit for /tmp and > /tmp/.X11-unix and setuid for the socket file) > - When I run xwd on the same session, it also freezes.
And it > communicates with XVnc using a regular socket (through port 6013). > This means that Xvnc is not responding the poll no matter how I > communicate (through a local or a regular socket) and it also means > that the problem is not restricted to generating OTP passwords. Based > on that, I think that disabling local sockets won't help. > - I have also been running a portal based on TurboVNC/VirtualGL for a > couple of years and that's the first time that something similar happens. > > I don't know if this happens anyway, but I have runned strace on the > problematic Xvnc process and it keeps doing just the same (looping > over the same system calls shown below), even when I run a vncpasswd, > while I can see a different behavior, i.e. normal processing, when I > telnet to port 5913. > > select(128, [0 1 3 4 5 6 7 9 11 13 14 17 18 19 20 21 22 24 25 26 27], > NULL, NULL, {593, 19000}) = -1 EBADF (Bad file descriptor) > select(6, [5], NULL, NULL, {0, 0}) = 0 (Timeout) > select(7, [6], NULL, NULL, {0, 0}) = 0 (Timeout) > select(8, [7], NULL, NULL, {0, 0}) = 1 (in [7], left {0, 0}) > select(12, [11], NULL, NULL, {0, 0}) = 0 (Timeout) > select(14, [13], NULL, NULL, {0, 0}) = 1 (in [13], left {0, 0}) > select(15, [14], NULL, NULL, {0, 0}) = 0 (Timeout) > select(18, [17], NULL, NULL, {0, 0}) = 0 (Timeout) > select(19, [18], NULL, NULL, {0, 0}) = 1 (in [18], left {0, 0}) > select(20, [19], NULL, NULL, {0, 0}) = 0 (Timeout) > select(21, [20], NULL, NULL, {0, 0}) = 0 (Timeout) > select(22, [21], NULL, NULL, {0, 0}) = 0 (Timeout) > select(23, [22], NULL, NULL, {0, 0}) = 0 (Timeout) > select(25, [24], NULL, NULL, {0, 0}) = 0 (Timeout) > select(26, [25], NULL, NULL, {0, 0}) = 1 (in [25], left {0, 0}) > select(27, [26], NULL, NULL, {0, 0}) = 0 (Timeout) > select(28, [27], NULL, NULL, {0, 0}) = 0 (Timeout) > select(13, [3], NULL, NULL, {0, 0}) = 0 (Timeout) > select(5, [4], NULL, NULL, {0, 0}) = 0 (Timeout) > > - I also tried to set up a regular VNC password for the session 
(I did > it by running "vncpasswd -d :13", is it correct? At that time, I > monitored Xvnc with strace and it did nothing different from the > selects shown above). Then I tried to open the VncViewer passing the > password through command line. I've got the following output: > > Initializing... > Connecting to localhost, port 5913... > Connected to server > Tentando conectar em localhost na porta 5555 > RFB server supports protocol version 3.8 > Using RFB protocol version 3.8 > Enabling TightVNC protocol extensions > Performing standard VNC authentication > Error: The one-time password has not been set on the server > java.lang.Exception: The one-time password has not been set on the server > at RfbProto.readConnFailedReason(RfbProto.java:574) > at RfbProto.readSecurityResult(RfbProto.java:556) > at RfbProto.authenticateVNC(RfbProto.java:475) > at VncViewer.connectAndAuthenticate(VncViewer.java:370) > at VncViewer.run(VncViewer.java:188) > at java.lang.Thread.run(Unknown Source) > RFB socket closed > Closing window > Disconnecting > > - And the following strace for Xvnc > > accept(3, {sa_family=AF_INET, sin_port=htons(57854), > sin_addr=inet_addr("127.0.0.1")}, [54348846876065808]) = 8 > fcntl(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 > setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0 > write(2, "\n", 1) = 1 > stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 > write(2, "25/10/2013 09:38:41 ", 20) = 20 > write(2, "Got connection from client 127.0"..., 37) = 37 > getpeername(8, {sa_family=AF_INET, sin_port=htons(57854), > sin_addr=inet_addr("127.0.0.1")}, [34359738384]) = 0 > write(8, "RFB 003.008\n", 12) = 12 > read(8, "RFB 003.008\n", 12) = 12 > stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 > write(2, "25/10/2013 09:38:44 ", 20) = 20 > write(2, "Using protocol version 3.8\n", 27) = 27 > write(8, "\2\2\20", 3) = 3 > read(8, "\20", 1) = 1 > stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 > write(2, 
"25/10/2013 09:38:44 ", 20) = 20 > write(2, "Enabling TightVNC protocol exten"..., 38) = 38 > write(8, "\0\0\0\0", 4) = 4 > write(8, "\0\0\0\1", 4) = 4 > write(8, "\0\0\0\2STDVVNCAUTH_", 16) = 16 > read(8, "\0\0\0\2", 4) = 4 > write(8, "!\236\\U\252\30\25\356\265:\254\275$=\35s", 16) = 16 > read(8, "J\221#\247]\376\274I\370C\232\314h\334;\263", 16) = 16 > write(8, "\0\0\0\1\0\0\0004", 8) = 8 > write(8, "The one-time password has not be"..., 52) = 52 > close(8) = 0 > stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 > write(2, "25/10/2013 09:38:44 ", 20) = 20 > write(2, "Client 127.0.0.1 gone\n", 22) = 22 > stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 > write(2, "25/10/2013 09:38:44 ", 20) = 20 > write(2, "Statistics:\n", 12) = 12 > stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 > write(2, "25/10/2013 09:38:44 ", 20) = 20 > write(2, " framebuffer updates 0, rectang"..., 47) = 47 > > Any ideas? > > Best regards, > > Rafael > > > 2013/10/24 DRC <dco...@us... > <mailto:dco...@us...>> > > Since Xvnc is single-threaded, if you are able to telnet in on 59xx, > that means that the server is not locked up. That leaves the > following > as possibilities in my mind: > > -- Perhaps this is due to a problem in the X11 client library. If so, > then there's probably nothing I can do about it. Try updating your > system to the latest O/S patches? > > -- Double-check permissions on /tmp, /tmp/.X11-unix, etc. They should > be writable by whoever used is launching the VNC server. > > -- If all else fails, then perhaps you can work around it by disabling > local X11 communications. You do this by starting Xvnc with > '-nolisten > local', which forces it to listen for X11 traffic on a TCP socket > rather > than a local socket, so it will not use /tmp/.X11-unix/X* in that > case. 
> > Several things that I still don't understand about this situation: > > -- If you mentioned what O/S these machines are running, I missed > it, so > please provide that information. If it's an O/S I haven't tested, > then > I'm happy to try reproducing it on my end, but you implied that there > was nothing different about the machines that were experiencing > the bug > vs. the machines that aren't (?) > > -- When the X server "locks up", does it remain locked? That is, do > repeated attempts to run vncpasswd fail in the same way? > > -- Have you tried to assign a regular VNC password to the session? > That > would give you a way to log in and verify whether it is still properly > accepting client connections once vncpasswd starts failing. > > I'm not saying that there's no possibility of a bug in TurboVNC, > but we > do have thousands of seats worldwide that are using portals built > around > the OTP functionality. These portals are dynamically generating OTPs > every time a user launches or reconnects to a session, so it seems > to me > that if this was a bug in TurboVNC, someone else would have stumbled > upon it by now. > > > On 10/24/13 9:26 AM, Rafael Guimaraes wrote: > > Hi DRC, > > > > Just some last information... I have tried to telnet Xvnc (in > ports 5813 > > and 5913, since I using display 13) in order to see if the Xvnc > process > > is completely locked or not and it seems to be responding to my > > connections correclty. > > > > I have opened a telnet to port 5813, sent a "GET /VncViewer.jar" > and was > > able to download the applet. Then I opened a telnet to port 5913, > > received "RFB 003.008", send "RFB 003.003" and got some binary > info I > > was not able to figure out... But it seems to be responding > correctly, > > it is not completely locked. I don't know what further tests > should I do > > for finding out what's happening. 
Since it has already happened > twice, > > in two different machines, I would like to know as much as I > possible in > > order to avoid it from happening again in the future... > > > > Cheers, > > > > Rafael > > > > > > 2013/10/24 Rafael Guimaraes <rg...@gm... > <mailto:rg...@gm...> <mailto:rg...@gm... > <mailto:rg...@gm...>>> > > > > Hi DRC, > > > > I overcame my laziness and launched vncpasswd with gdb. The > result > > was pretty much the same of what strace and printfs have > shown. The > > problem really seems to be in the communication with the > Xvnc server... > > > > Cheers, > > > > Rafael > > > > > > > > Reading symbols from /opt/TurboVNC/bin/vncpasswd...Reading > symbols > > from /usr/lib/debug/opt/TurboVNC/bin/vncpasswd.debug...done. > > done. > > (gdb) r -o -display :13 > > Starting program: /opt/TurboVNC/bin/vncpasswd -o -display :13 > > > > Program received signal SIGINT, Interrupt. > > 0x000000371f6cc30f in poll () from /lib64/libc.so.6 > > (gdb) bt > > #0 0x000000371f6cc30f in poll () from /lib64/libc.so.6 > > #1 0x000000372124a9f0 in ?? () from /usr/lib64/libX11.so.6 > > #2 0x000000372124ae19 in _XRead () from /usr/lib64/libX11.so.6 > > #3 0x00000037212378c9 in XOpenDisplay () from > /usr/lib64/libX11.so.6 > > #4 0x000000000040187c in DoOTP () at vncpasswd.c:81 > > #5 0x0000000000402187 in main (argc=4, argv=0x7fffffffe748) at > > vncpasswd.c:293 > > (gdb) > > > > > > > > 2013/10/24 Rafael Guimaraes <rg...@gm... > <mailto:rg...@gm...> <mailto:rg...@gm... > <mailto:rg...@gm...>>> > > > > Hi DRC, > > > > I have followed the best debugging practice there is (don't > > worry, I'm being ironic :)), putting printfs around the > possible > > problem and I figured out that the vncpasswd hangs when > calling > > XOpenDisplay. When executing the code below, it prints > "Opening > > display :13" and nothing else. If I run it with strace, > I get > > the trace below. 
> > > > I didn't do the gdb test just because I am not very familiar > > with doing it by gdb (I've already done it before, but > there has > > been centuries). If the following information is not > enough and > > gdb would help, I am sure I can handle it, no problem. I > am just > > trying to make thing easier for me! A little bit > egocentric, I > > know, but I just can't help it... :) > > > > If you check the strace output, it hangs on the last > line, when > > I seems to be trying to communicate through a socket > > "/tmp/.X11-unix/X13". I imagine it is trying to reach > Xvnc on > > display 13 and it is getting no response, so it keeps > waiting > > forever... Am I right? What should I do if this is the case? > > > > Cheers, > > > > Rafael > > > > VNCPASSWD CODE WITH PRINTFS: > > > > int DoOTP() > > { > > unsigned int full; > > unsigned int view = 0; > > Display* dpy; > > Atom prop; > > int len; > > char buf[MAXPWLEN + 1]; > > char bytes[MAXPWLEN * 2]; > > #ifdef UseDevUrandom > > int fd; > > #endif > > > > *printf("Opening display %s\n",displayname);* > > if ((dpy = XOpenDisplay(displayname)) == NULL) { > > fprintf(stderr, "unable to open display \"%s\"\n", > > XDisplayName(displayname)); > > return(1); > > } > > *printf("Display opened\n");* > > > > > > STRACE: > > > > write(1, "Opening display :13\n", 20Opening display :13 > > ) = 20 > > brk(0) = 0x59f3000 > > brk(0x5a14000) = 0x5a14000 > > uname({sys="Linux", node="server03", ...}) = 0 > > socket(PF_FILE, SOCK_STREAM, 0) = 3 > > uname({sys="Linux", node="server03", ...}) = 0 > > uname({sys="Linux", node="server03", ...}) = 0 > > connect(3, {sa_family=AF_FILE, > path="/tmp/.X11-unix/X13"...}, > > 20) = 0 > > uname({sys="Linux", node="server03", ...}) = 0 > > fcntl(3, F_SETFD, FD_CLOEXEC) = 0 > > access("/u/cnhu/.Xauthority", R_OK) = 0 > > open("/u/cnhu/.Xauthority", O_RDONLY) = 4 > > fstat(4, {st_mode=S_IFREG|0600, st_size=13139, ...}) = 0 > > mmap(NULL, 32768, PROT_READ|PROT_WRITE, > > 
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b34aa97f000 > > read(4, "\1\0\0\10mi080113\0\00210\0\22MIT-MAGIC-COOK"..., > > 32768) = 13139 > > read(4, "", 32768) = 0 > > close(4) = 0 > > munmap(0x2b34aa97f000, 32768) = 0 > > writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, > > {"MIT-MAGIC-COOKIE-1", 18}, {"\0\0", 2}, > > {"e\30\226A>\316@\17Y8\365+V\21\365\f", 16}], 4) = 48 > > fcntl(3, F_GETFL) = 0x2 (flags O_RDWR) > > fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 > > read(3, 0x7fff865304b0, 8) = -1 EAGAIN (Resource > > temporarily unavailable) > > poll([{fd=3, events=POLLIN}], 1, -1 <unfinished ...> > > > > > > > > > > 2013/10/24 DRC <dco...@us... > <mailto:dco...@us...> > > <mailto:dco...@us... > <mailto:dco...@us...>>> > > > > Can you figure out where in the vncpasswd code it's > locking > > up? That > > would help diagnose the issue. If you're using an > RPM-based > > system, you > > can install the turbovnc-debuginfo RPM and get a > stack trace > > from gdb > > when it locks up. Otherwise, you'll have to build > vncpasswd > > from > > source, but it's straightforward. > > > > This is a complete shot in the dark, but I'm > wondering if > > maybe the > > /dev/urandom device is somehow giving you problems > on some > > of your > > machines. vncpasswd will read from /dev/urandom when it > > generates an > > OTP. If that is the problem, then adding > > > > #undef UseDevUrandom > > > > to the top of vncpasswd.c should temporarily work > around it, > > and it > > would be easy to add a more permanent workaround (a > command > > line switch > > that avoids using /dev/urandom.) > > > > > > On 10/23/13 1:36 PM, Rafael Guimaraes wrote: > > > Hi, > > > > > > I have created a VNC session and after using it > for sometime, it seems > > > to be "locked". It is configured to use > one-time-passwords and I just > > > can't generate a new otp for access the session, > vncpasswd hangs > > > indefinitely when trying to get a new password. 
> > > > > > The session was created by the following command line: > > > /opt/TurboVNC/bin/Xvnc :13 -desktop X -httpd > > > /opt/TurboVNC/bin/../vnc/classes -auth > /u/cnhu/.Xauthority > > > -dontdisconnect -geometry 3192x1046 -depth 24 > -rfbwait 120000 -otpauth > > > -rfbport 5913 -fp > > > > /usr/share/X11/fonts/misc,/usr/share/X11/fonts/75dpi,/usr/share/X11/fonts/100dpi,/usr/share/X11/fonts/Type1,/usr/share/fonts/default/Type1 > > > -co /usr/share/X11/rgb -deferupdate 1 > > > > > > When trying to launch a "vncpasswd -otp -display > :13", I get no response > > > and eventually I have to press Ctrl+C. I have > runned the vncpasswd > > > command line with strace and it seems to hang > while trying to connect to > > > Xvnc through the "/tmp/.X11-unix/X13" socket (not > sure if I've got it > > > right). Below you may check the end of the strace > (until it hangs). > > > > > > What may the problem be? The same happened twice > in two different > > > machines. I have several other VNC sessions where > this does not happen. > > > Can I collect information from anywhere else that > could help me trying > > > to figure out what happened? 
> > > > > > uname({sys="Linux", node="server03", ...}) = 0 > > > socket(PF_FILE, SOCK_STREAM, 0) = 3 > > > uname({sys="Linux", node="server03", ...}) = 0 > > > uname({sys="Linux", node="server03", ...}) = 0 > > > connect(3, {sa_family=AF_FILE, > path="/tmp/.X11-unix/X13"...}, 20) = 0 > > > uname({sys="Linux", node="server03", ...}) = 0 > > > fcntl(3, F_SETFD, FD_CLOEXEC) = 0 > > > access("/u/cnhu/.Xauthority", R_OK) = 0 > > > open("/u/cnhu/.Xauthority", O_RDONLY) = 4 > > > fstat(4, {st_mode=S_IFREG|0600, st_size=13139, > ...}) = 0 > > > mmap(NULL, 32768, PROT_READ|PROT_WRITE, > MAP_PRIVATE|MAP_ANONYMOUS, -1, > > > 0) = 0x2ab9583b5000 > > > read(4, > "\1\0\0\10mi080113\0\00210\0\22MIT-MAGIC-COOK"..., 32768) = 13139 > > > read(4, "", 32768) = 0 > > > close(4) = 0 > > > munmap(0x2ab9583b5000, 32768) = 0 > > > writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, > {"MIT-MAGIC-COOKIE-1", > > > 18}, {"\0\0", 2}, > {"e\30\226A>\316@\17Y8\365+V\21\365\f", 16}], 4) = 48 > > > fcntl(3, F_GETFL) = 0x2 (flags O_RDWR) > > > fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 > > > read(3, 0x7fff0b98aab0, 8) = -1 EAGAIN > (Resource > > > temporarily unavailable) > > > poll([{fd=3, events=POLLIN}], 1, -1 <unfinished ...> > > > > > ------------------------------------------------------------------------------ > > October Webinars: Code for Performance > > Free Intel webinars can help you accelerate application > > performance. > > Explore tips for MPI, OpenMP, advanced profiling, > and more. > > Get the most from > > the latest Intel processors and coprocessors. See > abstracts > > and register > > > > http://pubads.g.doubleclick.net/gampad/clk?id=60135991&iu=/4140/ostg.clktrk > > _______________________________________________ > > VirtualGL-Users mailing list > > Vir...@li... > <mailto:Vir...@li...> > > <mailto:Vir...@li... 
> <mailto:Vir...@li...>> > > https://lists.sourceforge.net/lists/listinfo/virtualgl-users |
From: Rafael G. <rg...@gm...> - 2013-10-25 17:55:39
|
In fact, I think it may be difficult for you to replicate the problem. I am not able to do it either. It happened twice with the same user, and he couldn't explain to me what caused the lockup. He was using some software from Landmark (OpenWorks), something he does on a daily basis, and the TurboVNC session closed. After that, Xvnc was locked up as I reported.

There is also one issue that I thought I had mentioned, but on rereading my emails I see that I hadn't: I am using TurboVNC 1.1, because of the bug I previously reported (problems when two mouse buttons are pressed simultaneously).

Anyway, gdb told me the following:

#0 0x000000371f6ce3b3 in __select_nocancel () from /lib64/libc.so.6
#1 0x000000000045883f in CheckConnections () at connection.c:997
#2 0x0000000000456ba7 in WaitForSomething (pClientsReady=0xed922c0) at WaitFor.c:356
#3 0x0000000000442b19 in Dispatch () at dispatch.c:259
#4 0x00000000004270fd in main (argc=24, argv=0x7fff92910dd8) at main.c:400

It stays in this select even when I run vncpasswd. I will try to find out whether some specific function of OpenWorks is causing Xvnc to hang, so that I can have more information about the problem.

For the moment, since XOpenDisplay may hang, I have implemented a hack in the vncpasswd code (v. 1.2, since I tried it with Xvnc 1.1) that I would like to share, if you think it may be useful for making vncpasswd more fault tolerant (it waits 10 seconds for XOpenDisplay to return; otherwise it exits).
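For comparison with the patch that follows, here is a minimal standalone sketch of the same alarm-based guard. It is not part of vncpasswd. It assumes that the blocking call retries on EINTR, so the handler has to terminate the process itself, and it restricts the handler to write() and _exit(), which POSIX lists as async-signal-safe (fprintf() and exit() are not on that list). All helper names are hypothetical, and a poll() on an idle pipe stands in for a hung XOpenDisplay().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static const char timeout_msg[] = "timed out waiting for the X display\n";

/* SIGALRM handler: uses only async-signal-safe calls. */
static void on_timeout(int signum)
{
  ssize_t r;
  (void)signum;
  r = write(STDERR_FILENO, timeout_msg, sizeof(timeout_msg) - 1);
  (void)r;
  _exit(1);
}

/* Hypothetical helper: install the handler and start the timer. */
static void arm_timeout(unsigned int seconds)
{
  struct sigaction sa;

  assert(seconds > 0);
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = on_timeout;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGALRM, &sa, NULL);
  alarm(seconds);
}

/* Hypothetical helper: cancel the timer once the call has returned. */
static void disarm_timeout(void)
{
  alarm(0);
}

/* Stand-in for a hung XOpenDisplay(): wait forever on a pipe that never
 * receives data, retrying on EINTR. */
static void block_like_hung_server(void)
{
  int fds[2];
  struct pollfd pfd;

  if (pipe(fds) != 0)
    _exit(2);
  pfd.fd = fds[0];
  pfd.events = POLLIN;
  while (poll(&pfd, 1, -1) < 0 && errno == EINTR)
    ;
}

/* Run the guarded blocking call in a child process and return the
 * child's exit status (1 means the timeout fired), or -1 on error. */
int guarded_child_exit_status(unsigned int seconds)
{
  int status;
  pid_t pid = fork();

  if (pid < 0)
    return -1;
  if (pid == 0) {
    arm_timeout(seconds);
    block_like_hung_server();   /* XOpenDisplay() would go here */
    disarm_timeout();
    _exit(0);
  }
  if (waitpid(pid, &status, 0) < 0)
    return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In a real vncpasswd the fork would not be needed: arm_timeout() would be called just before XOpenDisplay() and disarm_timeout() right after it returns. Cancelling the alarm also makes a separate "reply received" flag unnecessary.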
--- vncpasswd.c 2012-09-30 21:06:15.000000000 -0300
+++ /tmp/vncpasswd.c 2013-10-25 15:50:28.690931098 -0200
@@ -37,7 +37,8 @@
 #include <sys/types.h>
 #include <unistd.h>
 #include <errno.h>
 #include "vncauth.h"
+#include <signal.h>
 #include <X11/Xlib.h>
 #include <X11/Xatom.h>
@@ -64,6 +65,20 @@
 int otpClear;
 char* displayname;
+void displaytimeout(int);
+int displayreply;
+#define DISPLAYTIMEOUT 10
+
+
+void displaytimeout(int signum)
+{
+  if (!displayreply) {
+    fprintf(stderr, "unable to communicate to display \"%s\"\n",
+            XDisplayName(displayname));
+    exit(1);
+  }
+}
+
 int DoOTP()
 {
@@ -78,19 +93,21 @@
   int fd;
 #endif
+  signal(SIGALRM, displaytimeout);
+  alarm(DISPLAYTIMEOUT);
+  displayreply=0;
   if ((dpy = XOpenDisplay(displayname)) == NULL) {
     fprintf(stderr, "unable to open display \"%s\"\n",
             XDisplayName(displayname));
     return(1);
   }
-
+  displayreply=1;
   prop = XInternAtom(dpy, "VNC_OTP", True);
   if (prop == None) {
     fprintf(stderr, "The X display \"%s\" does not support VNC one-time passwords\n",
             XDisplayName(displayname));
     return(1);
   }
-
   if (otpClear) {
     len = 0;

Cheers,

Rafael

2013/10/25 DRC <dco...@us...>

> Really not sure what's going on here. I am running CentOS 5.9 as well,
> and I launched Xvnc using the exact same command line that you used below.
> I set up a script that generates 10,000 OTPs in rapid succession, then I
> ran 10 copies of it in parallel, and all of them ran to completion with no
> errors or lockups. I tried again using a script that creates 1000 OTPs in
> rapid succession, and I ran 100 copies of it. This was using the latest
> stable build of 1.2.x available at
> http://virtualgl.sourceforge.net/vnc.nightly/, but nothing has changed
> since 1.2 that would have affected this, so I would expect identical
> behavior from 1.2.
>
> The fact that you have been running for years with no errors really seems
> to indicate that something has changed at the system level.
> We have not modified the way OTPs are generated, and in fact, I actually
> don't think that feature has been touched since it was first introduced
> in 1.0.
>
> One comment I will make is that you aren't setting up the VNC password
> correctly below. You have to pass an argument of -rfbauth to Xvnc in order
> to set it up with a "normal" VNC password.
>
> Rather than an strace output, it would be more useful to know where in the
> Xvnc code the "infinite loop" is occurring. You should be able to get that
> info with gdb. In the past, there have been several cases in which I was
> able to diagnose an Xvnc error by simply looking at the code, without even
> having to reproduce it.
>
>
> On 10/25/13 6:46 AM, Rafael Guimaraes wrote:
>
> Replying what you asked:
>
> - I really didn't mention my OS. It is a CentOS 5.9 64 bits. It is pretty
> much updated.
> - Yes, when the X server locks up it remains locked. I have one Xvnc
> locked up right now and that's where I am performing all the tests.
> - Sockets permissions as well as its directories are ok, I have just
> checked it (they are all 777, with sticky bit for /tmp and /tmp/.X11-unix
> and setuid for the socket file)
> - When I run xwd on the same session, it also freezes. And it communicates
> with XVnc using a regular socket (through port 6013). This means that Xvnc
> is not responding the poll no matter how I communicate (through a local or
> a regular socket) and it also means that the problem is not restricted to
> generating OTP passwords. Based on that, I think that disabling local
> sockets won't help.
> - I have also been running a portal based on TurboVNC/VirtualGL for a
> couple of years and that's the first time that something similar happens.
>
> I don't know if this happens anyway, but I have runned strace on the
> problematic Xvnc process and it keeps doing just the same (looping over the
> same system calls shown below), even when I run a vncpasswd, while I can
> see a different behavior, i.e.
> normal processing, when I telnet to port 5913.
>
> select(128, [0 1 3 4 5 6 7 9 11 13 14 17 18 19 20 21 22 24 25 26 27], NULL, NULL, {593, 19000}) = -1 EBADF (Bad file descriptor)
> select(6, [5], NULL, NULL, {0, 0}) = 0 (Timeout)
> select(7, [6], NULL, NULL, {0, 0}) = 0 (Timeout)
> select(8, [7], NULL, NULL, {0, 0}) = 1 (in [7], left {0, 0})
> select(12, [11], NULL, NULL, {0, 0}) = 0 (Timeout)
> select(14, [13], NULL, NULL, {0, 0}) = 1 (in [13], left {0, 0})
> select(15, [14], NULL, NULL, {0, 0}) = 0 (Timeout)
> select(18, [17], NULL, NULL, {0, 0}) = 0 (Timeout)
> select(19, [18], NULL, NULL, {0, 0}) = 1 (in [18], left {0, 0})
> select(20, [19], NULL, NULL, {0, 0}) = 0 (Timeout)
> select(21, [20], NULL, NULL, {0, 0}) = 0 (Timeout)
> select(22, [21], NULL, NULL, {0, 0}) = 0 (Timeout)
> select(23, [22], NULL, NULL, {0, 0}) = 0 (Timeout)
> select(25, [24], NULL, NULL, {0, 0}) = 0 (Timeout)
> select(26, [25], NULL, NULL, {0, 0}) = 1 (in [25], left {0, 0})
> select(27, [26], NULL, NULL, {0, 0}) = 0 (Timeout)
> select(28, [27], NULL, NULL, {0, 0}) = 0 (Timeout)
> select(13, [3], NULL, NULL, {0, 0}) = 0 (Timeout)
> select(5, [4], NULL, NULL, {0, 0}) = 0 (Timeout)
>
> - I also tried to set up a regular VNC password for the session (I did
> it by running "vncpasswd -d :13", is it correct? At that time, I monitored
> Xvnc with strace and it did nothing different from the selects shown
> above). Then I tried to open the VncViewer passing the password through
> command line. I've got the following output:
>
> Initializing...
> Connecting to localhost, port 5913...
> Connected to server > Tentando conectar em localhost na porta 5555 > RFB server supports protocol version 3.8 > Using RFB protocol version 3.8 > Enabling TightVNC protocol extensions > Performing standard VNC authentication > Error: The one-time password has not been set on the server > java.lang.Exception: The one-time password has not been set on the server > at RfbProto.readConnFailedReason(RfbProto.java:574) > at RfbProto.readSecurityResult(RfbProto.java:556) > at RfbProto.authenticateVNC(RfbProto.java:475) > at VncViewer.connectAndAuthenticate(VncViewer.java:370) > at VncViewer.run(VncViewer.java:188) > at java.lang.Thread.run(Unknown Source) > RFB socket closed > Closing window > Disconnecting > > - And the following strace for Xvnc > > accept(3, {sa_family=AF_INET, sin_port=htons(57854), > sin_addr=inet_addr("127.0.0.1")}, [54348846876065808]) = 8 > fcntl(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 > setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0 > write(2, "\n", 1) = 1 > stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 > write(2, "25/10/2013 09:38:41 ", 20) = 20 > write(2, "Got connection from client 127.0"..., 37) = 37 > getpeername(8, {sa_family=AF_INET, sin_port=htons(57854), > sin_addr=inet_addr("127.0.0.1")}, [34359738384]) = 0 > write(8, "RFB 003.008\n", 12) = 12 > read(8, "RFB 003.008\n", 12) = 12 > stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 > write(2, "25/10/2013 09:38:44 ", 20) = 20 > write(2, "Using protocol version 3.8\n", 27) = 27 > write(8, "\2\2\20", 3) = 3 > read(8, "\20", 1) = 1 > stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 > write(2, "25/10/2013 09:38:44 ", 20) = 20 > write(2, "Enabling TightVNC protocol exten"..., 38) = 38 > write(8, "\0\0\0\0", 4) = 4 > write(8, "\0\0\0\1", 4) = 4 > write(8, "\0\0\0\2STDVVNCAUTH_", 16) = 16 > read(8, "\0\0\0\2", 4) = 4 > write(8, "!\236\\U\252\30\25\356\265:\254\275$=\35s", 16) = 16 > read(8, 
"J\221#\247]\376\274I\370C\232\314h\334;\263", 16) = 16 > write(8, "\0\0\0\1\0\0\0004", 8) = 8 > write(8, "The one-time password has not be"..., 52) = 52 > close(8) = 0 > stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 > write(2, "25/10/2013 09:38:44 ", 20) = 20 > write(2, "Client 127.0.0.1 gone\n", 22) = 22 > stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 > write(2, "25/10/2013 09:38:44 ", 20) = 20 > write(2, "Statistics:\n", 12) = 12 > stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 > write(2, "25/10/2013 09:38:44 ", 20) = 20 > write(2, " framebuffer updates 0, rectang"..., 47) = 47 > > Any ideas? > > Best regards, > > Rafael > > > 2013/10/24 DRC <dco...@us...> > >> Since Xvnc is single-threaded, if you are able to telnet in on 59xx, >> that means that the server is not locked up. That leaves the following >> as possibilities in my mind: >> >> -- Perhaps this is due to a problem in the X11 client library. If so, >> then there's probably nothing I can do about it. Try updating your >> system to the latest O/S patches? >> >> -- Double-check permissions on /tmp, /tmp/.X11-unix, etc. They should >> be writable by whoever used is launching the VNC server. >> >> -- If all else fails, then perhaps you can work around it by disabling >> local X11 communications. You do this by starting Xvnc with '-nolisten >> local', which forces it to listen for X11 traffic on a TCP socket rather >> than a local socket, so it will not use /tmp/.X11-unix/X* in that case. >> >> Several things that I still don't understand about this situation: >> >> -- If you mentioned what O/S these machines are running, I missed it, so >> please provide that information. If it's an O/S I haven't tested, then >> I'm happy to try reproducing it on my end, but you implied that there >> was nothing different about the machines that were experiencing the bug >> vs. the machines that aren't (?) 
>> >> -- When the X server "locks up", does it remain locked? That is, do >> repeated attempts to run vncpasswd fail in the same way? >> >> -- Have you tried to assign a regular VNC password to the session? That >> would give you a way to log in and verify whether it is still properly >> accepting client connections once vncpasswd starts failing. >> >> I'm not saying that there's no possibility of a bug in TurboVNC, but we >> do have thousands of seats worldwide that are using portals built around >> the OTP functionality. These portals are dynamically generating OTPs >> every time a user launches or reconnects to a session, so it seems to me >> that if this was a bug in TurboVNC, someone else would have stumbled >> upon it by now. >> >> >> On 10/24/13 9:26 AM, Rafael Guimaraes wrote: >> > Hi DRC, >> > >> > Just some last information... I have tried to telnet Xvnc (in ports 5813 >> > and 5913, since I using display 13) in order to see if the Xvnc process >> > is completely locked or not and it seems to be responding to my >> > connections correclty. >> > >> > I have opened a telnet to port 5813, sent a "GET /VncViewer.jar" and was >> > able to download the applet. Then I opened a telnet to port 5913, >> > received "RFB 003.008", send "RFB 003.003" and got some binary info I >> > was not able to figure out... But it seems to be responding correctly, >> > it is not completely locked. I don't know what further tests should I do >> > for finding out what's happening. Since it has already happened twice, >> > in two different machines, I would like to know as much as I possible in >> > order to avoid it from happening again in the future... >> > >> > Cheers, >> > >> > Rafael >> > >> > >> > 2013/10/24 Rafael Guimaraes <rg...@gm... <mailto:rg...@gm... >> >> >> > >> > Hi DRC, >> > >> > I overcame my laziness and launched vncpasswd with gdb. The result >> > was pretty much the same of what strace and printfs have shown. 
The >> > problem really seems to be in the communication with the Xvnc >> server... >> > >> > Cheers, >> > >> > Rafael >> > >> > >> > >> > Reading symbols from /opt/TurboVNC/bin/vncpasswd...Reading symbols >> > from /usr/lib/debug/opt/TurboVNC/bin/vncpasswd.debug...done. >> > done. >> > (gdb) r -o -display :13 >> > Starting program: /opt/TurboVNC/bin/vncpasswd -o -display :13 >> > >> > Program received signal SIGINT, Interrupt. >> > 0x000000371f6cc30f in poll () from /lib64/libc.so.6 >> > (gdb) bt >> > #0 0x000000371f6cc30f in poll () from /lib64/libc.so.6 >> > #1 0x000000372124a9f0 in ?? () from /usr/lib64/libX11.so.6 >> > #2 0x000000372124ae19 in _XRead () from /usr/lib64/libX11.so.6 >> > #3 0x00000037212378c9 in XOpenDisplay () from >> /usr/lib64/libX11.so.6 >> > #4 0x000000000040187c in DoOTP () at vncpasswd.c:81 >> > #5 0x0000000000402187 in main (argc=4, argv=0x7fffffffe748) at >> > vncpasswd.c:293 >> > (gdb) >> > >> > >> > >> > 2013/10/24 Rafael Guimaraes <rg...@gm... <mailto: >> rg...@gm...>> >> > >> > Hi DRC, >> > >> > I have followed the best debugging practice there is (don't >> > worry, I'm being ironic :)), putting printfs around the possible >> > problem and I figured out that the vncpasswd hangs when calling >> > XOpenDisplay. When executing the code below, it prints "Opening >> > display :13" and nothing else. If I run it with strace, I get >> > the trace below. >> > >> > I didn't do the gdb test just because I am not very familiar >> > with doing it by gdb (I've already done it before, but there has >> > been centuries). If the following information is not enough and >> > gdb would help, I am sure I can handle it, no problem. I am just >> > trying to make thing easier for me! A little bit egocentric, I >> > know, but I just can't help it... :) >> > >> > If you check the strace output, it hangs on the last line, when >> > I seems to be trying to communicate through a socket >> > "/tmp/.X11-unix/X13". 
I imagine it is trying to reach Xvnc on >> > display 13 and it is getting no response, so it keeps waiting >> > forever... Am I right? What should I do if this is the case? >> > >> > Cheers, >> > >> > Rafael >> > >> > VNCPASSWD CODE WITH PRINTFS: >> > >> > int DoOTP() >> > { >> > unsigned int full; >> > unsigned int view = 0; >> > Display* dpy; >> > Atom prop; >> > int len; >> > char buf[MAXPWLEN + 1]; >> > char bytes[MAXPWLEN * 2]; >> > #ifdef UseDevUrandom >> > int fd; >> > #endif >> > >> > *printf("Opening display %s\n",displayname);* >> > if ((dpy = XOpenDisplay(displayname)) == NULL) { >> > fprintf(stderr, "unable to open display \"%s\"\n", >> > XDisplayName(displayname)); >> > return(1); >> > } >> > *printf("Display opened\n");* >> > >> > >> > STRACE: >> > >> > write(1, "Opening display :13\n", 20Opening display :13 >> > ) = 20 >> > brk(0) = 0x59f3000 >> > brk(0x5a14000) = 0x5a14000 >> > uname({sys="Linux", node="server03", ...}) = 0 >> > socket(PF_FILE, SOCK_STREAM, 0) = 3 >> > uname({sys="Linux", node="server03", ...}) = 0 >> > uname({sys="Linux", node="server03", ...}) = 0 >> > connect(3, {sa_family=AF_FILE, path="/tmp/.X11-unix/X13"...}, >> > 20) = 0 >> > uname({sys="Linux", node="server03", ...}) = 0 >> > fcntl(3, F_SETFD, FD_CLOEXEC) = 0 >> > access("/u/cnhu/.Xauthority", R_OK) = 0 >> > open("/u/cnhu/.Xauthority", O_RDONLY) = 4 >> > fstat(4, {st_mode=S_IFREG|0600, st_size=13139, ...}) = 0 >> > mmap(NULL, 32768, PROT_READ|PROT_WRITE, >> > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b34aa97f000 >> > read(4, "\1\0\0\10mi080113\0\00210\0\22MIT-MAGIC-COOK"..., >> > 32768) = 13139 >> > read(4, "", 32768) = 0 >> > close(4) = 0 >> > munmap(0x2b34aa97f000, 32768) = 0 >> > writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, >> > {"MIT-MAGIC-COOKIE-1", 18}, {"\0\0", 2}, >> > {"e\30\226A>\316@\17Y8\365+V\21\365\f", 16}], 4) = 48 >> > fcntl(3, F_GETFL) = 0x2 (flags O_RDWR) >> > fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 >> > read(3, 0x7fff865304b0, 8) = -1 EAGAIN (Resource >> > 
From: DRC <dco...@us...> - 2013-10-25 22:24:09
|
OK, well, why didn't you mention that before? Your original message implied that the lock-up either just occurred randomly or that it occurred as a result of running vncpasswd. Now you're saying that it occurs as a result of doing something specific in OpenWorks, which makes a lot more sense. In the process of working with Santos, we've uncovered a few issues like this in the past that were specific to oil & gas apps (those apps seem to be very good at finding bugs in X servers.)

I am looking into the mouse button issue, but since that just affects the viewer, you should still be able to use the 1.2 server. If you need the 1.1 applet, then just do the following:

-- Install the 1.1 RPM
-- mv /opt/TurboVNC/vnc/classes /tmp
-- Upgrade to the 1.2 RPM
-- mv /opt/TurboVNC/java /opt/TurboVNC/java.bak
-- mv /tmp/classes /opt/TurboVNC/java

The 1.2 server should now serve up the 1.1 applet.

I looked at the change log, and I don't see anything obvious that has changed in the server since 1.1 that might affect this, unless you're running into the 32-bit limitation of the idle timer that was fixed in 1.2 beta (but I doubt it-- the symptoms of that issue were different.) Regardless, though, a lot has changed in 1.2, so I really do recommend upgrading. Minimally, it ensures that we're on the same page.

On 10/25/13 12:55 PM, Rafael Guimaraes wrote:
> In fact, I think that it may be difficult for you to replicate the
> problem. I am not able to do it either... It happened twice with the
> same user and he couldn't explain to me what caused the locking. He was
> using some software from Landmark (OpenWorks), something he does on a
> daily basis, and the TurboVNC session closed. After that the Xvnc is
> locked as I reported.
> There is also one issue that I thought I had mentioned, but I was
> reviewing my emails and I hadn't... I am using TurboVNC 1.1, because of
> the bug I previously reported (problems when 2 mouse buttons are pressed
> simultaneously).
>
> Anyway, gdb told me the following:
>
> #0  0x000000371f6ce3b3 in __select_nocancel () from /lib64/libc.so.6
> #1  0x000000000045883f in CheckConnections () at connection.c:997
> #2  0x0000000000456ba7 in WaitForSomething (pClientsReady=0xed922c0) at WaitFor.c:356
> #3  0x0000000000442b19 in Dispatch () at dispatch.c:259
> #4  0x00000000004270fd in main (argc=24, argv=0x7fff92910dd8) at main.c:400
>
> It stays in this select even when I run vncpasswd.
>
> I will try to find out if some specific function of OpenWorks is causing
> Xvnc to hang so that I can have more information about the problem.
> For the moment, since XOpenDisplay may hang, I have implemented a hack
> in vncpasswd code (v. 1.2, since I tried it with Xvnc 1.1) that I would
> like to share, if you think that this may be useful in order to be more
> fault tolerant (it waits 10 seconds for XOpenDisplay to return,
> otherwise it exits).
>
> --- vncpasswd.c 2012-09-30 21:06:15.000000000 -0300
> +++ /tmp/vncpasswd.c 2013-10-25 15:50:28.690931098 -0200
> @@ -37,7 +37,8 @@
>  #include <sys/types.h>
>  #include <unistd.h>
>  #include <errno.h>
>  #include "vncauth.h"
> +#include <signal.h>
>
>  #include <X11/Xlib.h>
>  #include <X11/Xatom.h>
> @@ -64,6 +65,20 @@
>  int otpClear;
>  char* displayname;
>
> +void displaytimeout(int);
> +int displayreply;
> +#define DISPLAYTIMEOUT 10
> +
> +
> +void displaytimeout(int signum)
> +{
> +  if (!displayreply) {
> +    fprintf(stderr, "unable to communicate to display \"%s\"\n",
> +            XDisplayName(displayname));
> +    exit(1);
> +  }
> +}
> +
>
>  int DoOTP()
>  {
> @@ -78,19 +93,21 @@
>    int fd;
>  #endif
>
> +  signal(SIGALRM, displaytimeout);
> +  alarm(DISPLAYTIMEOUT);
> +  displayreply=0;
>    if ((dpy = XOpenDisplay(displayname)) == NULL) {
>      fprintf(stderr, "unable to open display \"%s\"\n",
>              XDisplayName(displayname));
>      return(1);
>    }
> -
> +  displayreply=1;
>    prop = XInternAtom(dpy, "VNC_OTP", True);
>    if (prop == None) {
>      fprintf(stderr, "The X display \"%s\" does not support VNC one-time passwords\n",
>              XDisplayName(displayname));
>      return(1);
>    }
> -
>    if (otpClear) {
>      len = 0;
>
> Cheers,
>
> Rafael
From: Rafael G. <rg...@gm...> - 2013-10-26 13:19:49
|
Thanks. I will try it!
Then I tried to open the VncViewer passing > >> the password through command line. I've got the following output: > >> > >> Initializing... > >> Connecting to localhost, port 5913... > >> Connected to server > >> Tentando conectar em localhost na porta 5555 > >> RFB server supports protocol version 3.8 > >> Using RFB protocol version 3.8 > >> Enabling TightVNC protocol extensions > >> Performing standard VNC authentication > >> Error: The one-time password has not been set on the server > >> java.lang.Exception: The one-time password has not been set on the > >> server > >> at RfbProto.readConnFailedReason(RfbProto.java:574) > >> at RfbProto.readSecurityResult(RfbProto.java:556) > >> at RfbProto.authenticateVNC(RfbProto.java:475) > >> at VncViewer.connectAndAuthenticate(VncViewer.java:370) > >> at VncViewer.run(VncViewer.java:188) > >> at java.lang.Thread.run(Unknown Source) > >> RFB socket closed > >> Closing window > >> Disconnecting > >> > >> - And the following strace for Xvnc > >> > >> accept(3, {sa_family=AF_INET, sin_port=htons(57854), > >> sin_addr=inet_addr("127.0.0.1")}, [54348846876065808]) = 8 > >> fcntl(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 > >> setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0 > >> write(2, "\n", 1) = 1 > >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = > 0 > >> write(2, "25/10/2013 09:38:41 ", 20) = 20 > >> write(2, "Got connection from client 127.0"..., 37) = 37 > >> getpeername(8, {sa_family=AF_INET, sin_port=htons(57854), > >> sin_addr=inet_addr("127.0.0.1")}, [34359738384]) = 0 > >> write(8, "RFB 003.008\n", 12) = 12 > >> read(8, "RFB 003.008\n", 12) = 12 > >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = > 0 > >> write(2, "25/10/2013 09:38:44 ", 20) = 20 > >> write(2, "Using protocol version 3.8\n", 27) = 27 > >> write(8, "\2\2\20", 3) = 3 > >> read(8, "\20", 1) = 1 > >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = > 0 > >> write(2, "25/10/2013 09:38:44 ", 20) = 20 
> >> write(2, "Enabling TightVNC protocol exten"..., 38) = 38 > >> write(8, "\0\0\0\0", 4) = 4 > >> write(8, "\0\0\0\1", 4) = 4 > >> write(8, "\0\0\0\2STDVVNCAUTH_", 16) = 16 > >> read(8, "\0\0\0\2", 4) = 4 > >> write(8, "!\236\\U\252\30\25\356\265:\254\275$=\35s", 16) = 16 > >> read(8, "J\221#\247]\376\274I\370C\232\314h\334;\263", 16) = 16 > >> write(8, "\0\0\0\1\0\0\0004", 8) = 8 > >> write(8, "The one-time password has not be"..., 52) = 52 > >> close(8) = 0 > >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = > 0 > >> write(2, "25/10/2013 09:38:44 ", 20) = 20 > >> write(2, "Client 127.0.0.1 gone\n", 22) = 22 > >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = > 0 > >> write(2, "25/10/2013 09:38:44 ", 20) = 20 > >> write(2, "Statistics:\n", 12) = 12 > >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = > 0 > >> write(2, "25/10/2013 09:38:44 ", 20) = 20 > >> write(2, " framebuffer updates 0, rectang"..., 47) = 47 > >> > >> Any ideas? > >> > >> Best regards, > >> > >> Rafael > >> > >> > >> 2013/10/24 DRC <dco...@us... > >> <mailto:dco...@us...>> > >> > >> Since Xvnc is single-threaded, if you are able to telnet in on > >> 59xx, > >> that means that the server is not locked up. That leaves the > >> following > >> as possibilities in my mind: > >> > >> -- Perhaps this is due to a problem in the X11 client library. > >> If so, > >> then there's probably nothing I can do about it. Try updating > >> your > >> system to the latest O/S patches? > >> > >> -- Double-check permissions on /tmp, /tmp/.X11-unix, etc. > >> They should > >> be writable by whoever used is launching the VNC server. > >> > >> -- If all else fails, then perhaps you can work around it by > >> disabling > >> local X11 communications. 
You do this by starting Xvnc with > >> '-nolisten > >> local', which forces it to listen for X11 traffic on a TCP > >> socket rather > >> than a local socket, so it will not use /tmp/.X11-unix/X* in > >> that case. > >> > >> Several things that I still don't understand about this > situation: > >> > >> -- If you mentioned what O/S these machines are running, I > >> missed it, so > >> please provide that information. If it's an O/S I haven't > >> tested, then > >> I'm happy to try reproducing it on my end, but you implied > >> that there > >> was nothing different about the machines that were > >> experiencing the bug > >> vs. the machines that aren't (?) > >> > >> -- When the X server "locks up", does it remain locked? That > >> is, do > >> repeated attempts to run vncpasswd fail in the same way? > >> > >> -- Have you tried to assign a regular VNC password to the > >> session? That > >> would give you a way to log in and verify whether it is still > >> properly > >> accepting client connections once vncpasswd starts failing. > >> > >> I'm not saying that there's no possibility of a bug in > >> TurboVNC, but we > >> do have thousands of seats worldwide that are using portals > >> built around > >> the OTP functionality. These portals are dynamically > >> generating OTPs > >> every time a user launches or reconnects to a session, so it > >> seems to me > >> that if this was a bug in TurboVNC, someone else would have > >> stumbled > >> upon it by now. > >> > >> > >> On 10/24/13 9:26 AM, Rafael Guimaraes wrote: > >> > Hi DRC, > >> > > >> > Just some last information... I have tried to telnet Xvnc > >> (in ports 5813 > >> > and 5913, since I using display 13) in order to see if the > >> Xvnc process > >> > is completely locked or not and it seems to be responding to > my > >> > connections correclty. > >> > > >> > I have opened a telnet to port 5813, sent a "GET > >> /VncViewer.jar" and was > >> > able to download the applet. 
Then I opened a telnet to port > >> 5913, > >> > received "RFB 003.008", send "RFB 003.003" and got some > >> binary info I > >> > was not able to figure out... But it seems to be responding > >> correctly, > >> > it is not completely locked. I don't know what further tests > >> should I do > >> > for finding out what's happening. Since it has already > >> happened twice, > >> > in two different machines, I would like to know as much as I > >> possible in > >> > order to avoid it from happening again in the future... > >> > > >> > Cheers, > >> > > >> > Rafael > >> > > >> > > >> > 2013/10/24 Rafael Guimaraes <rg...@gm... > >> <mailto:rg...@gm...> <mailto:rg...@gm... > >> <mailto:rg...@gm...>>> > >> > > >> > Hi DRC, > >> > > >> > I overcame my laziness and launched vncpasswd with gdb. > >> The result > >> > was pretty much the same of what strace and printfs have > >> shown. The > >> > problem really seems to be in the communication with the > >> Xvnc server... > >> > > >> > Cheers, > >> > > >> > Rafael > >> > > >> > > >> > > >> > Reading symbols from > >> /opt/TurboVNC/bin/vncpasswd...Reading symbols > >> > from > /usr/lib/debug/opt/TurboVNC/bin/vncpasswd.debug...done. > >> > done. > >> > (gdb) r -o -display :13 > >> > Starting program: /opt/TurboVNC/bin/vncpasswd -o > >> -display :13 > >> > > >> > Program received signal SIGINT, Interrupt. > >> > 0x000000371f6cc30f in poll () from /lib64/libc.so.6 > >> > (gdb) bt > >> > #0 0x000000371f6cc30f in poll () from /lib64/libc.so.6 > >> > #1 0x000000372124a9f0 in ?? () from > /usr/lib64/libX11.so.6 > >> > #2 0x000000372124ae19 in _XRead () from > >> /usr/lib64/libX11.so.6 > >> > #3 0x00000037212378c9 in XOpenDisplay () from > >> /usr/lib64/libX11.so.6 > >> > #4 0x000000000040187c in DoOTP () at vncpasswd.c:81 > >> > #5 0x0000000000402187 in main (argc=4, > >> argv=0x7fffffffe748) at > >> > vncpasswd.c:293 > >> > (gdb) > >> > > >> > > >> > > >> > 2013/10/24 Rafael Guimaraes <rg...@gm... 
> >> <mailto:rg...@gm...> <mailto:rg...@gm... > >> <mailto:rg...@gm...>>> > >> > > >> > Hi DRC, > >> > > >> > I have followed the best debugging practice there is > >> (don't > >> > worry, I'm being ironic :)), putting printfs around > >> the possible > >> > problem and I figured out that the vncpasswd hangs > >> when calling > >> > XOpenDisplay. When executing the code below, it > >> prints "Opening > >> > display :13" and nothing else. If I run it with > >> strace, I get > >> > the trace below. > >> > > >> > I didn't do the gdb test just because I am not very > >> familiar > >> > with doing it by gdb (I've already done it before, > >> but there has > >> > been centuries). If the following information is not > >> enough and > >> > gdb would help, I am sure I can handle it, no > >> problem. I am just > >> > trying to make thing easier for me! A little bit > >> egocentric, I > >> > know, but I just can't help it... :) > >> > > >> > If you check the strace output, it hangs on the last > >> line, when > >> > I seems to be trying to communicate through a socket > >> > "/tmp/.X11-unix/X13". I imagine it is trying to > >> reach Xvnc on > >> > display 13 and it is getting no response, so it > >> keeps waiting > >> > forever... Am I right? What should I do if this is > >> the case? 
> >> > > >> > Cheers, > >> > > >> > Rafael > >> > > >> > VNCPASSWD CODE WITH PRINTFS: > >> > > >> > int DoOTP() > >> > { > >> > unsigned int full; > >> > unsigned int view = 0; > >> > Display* dpy; > >> > Atom prop; > >> > int len; > >> > char buf[MAXPWLEN + 1]; > >> > char bytes[MAXPWLEN * 2]; > >> > #ifdef UseDevUrandom > >> > int fd; > >> > #endif > >> > > >> > *printf("Opening display %s\n",displayname);* > >> > if ((dpy = XOpenDisplay(displayname)) == NULL) { > >> > fprintf(stderr, "unable to open display > \"%s\"\n", > >> > XDisplayName(displayname)); > >> > return(1); > >> > } > >> > *printf("Display opened\n");* > >> > > >> > > >> > STRACE: > >> > > >> > write(1, "Opening display :13\n", 20Opening display > :13 > >> > ) = 20 > >> > brk(0) = 0x59f3000 > >> > brk(0x5a14000) = 0x5a14000 > >> > uname({sys="Linux", node="server03", ...}) = 0 > >> > socket(PF_FILE, SOCK_STREAM, 0) = 3 > >> > uname({sys="Linux", node="server03", ...}) = 0 > >> > uname({sys="Linux", node="server03", ...}) = 0 > >> > connect(3, {sa_family=AF_FILE, > >> path="/tmp/.X11-unix/X13"...}, > >> > 20) = 0 > >> > uname({sys="Linux", node="server03", ...}) = 0 > >> > fcntl(3, F_SETFD, FD_CLOEXEC) = 0 > >> > access("/u/cnhu/.Xauthority", R_OK) = 0 > >> > open("/u/cnhu/.Xauthority", O_RDONLY) = 4 > >> > fstat(4, {st_mode=S_IFREG|0600, st_size=13139, ...}) > = 0 > >> > mmap(NULL, 32768, PROT_READ|PROT_WRITE, > >> > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b34aa97f000 > >> > read(4, > >> "\1\0\0\10mi080113\0\00210\0\22MIT-MAGIC-COOK"..., > >> > 32768) = 13139 > >> > read(4, "", 32768) = 0 > >> > close(4) = 0 > >> > munmap(0x2b34aa97f000, 32768) = 0 > >> > writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, > >> > {"MIT-MAGIC-COOKIE-1", 18}, {"\0\0", 2}, > >> > {"e\30\226A>\316@\17Y8\365+V\21\365\f", 16}], 4) = 48 > >> > fcntl(3, F_GETFL) = 0x2 (flags O_RDWR) > >> > fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 > >> > read(3, 0x7fff865304b0, 8) = -1 EAGAIN (Resource > >> > temporarily unavailable) > >> > 
poll([{fd=3, events=POLLIN}], 1, -1 <unfinished ...> > >> > > >> > > >> > > >> > > >> > 2013/10/24 DRC <dco...@us... > >> <mailto:dco...@us...> > >> > <mailto:dco...@us... > >> <mailto:dco...@us...>>> > >> > > >> > Can you figure out where in the vncpasswd code > >> it's locking > >> > up? That > >> > would help diagnose the issue. If you're using > >> an RPM-based > >> > system, you > >> > can install the turbovnc-debuginfo RPM and get a > >> stack trace > >> > from gdb > >> > when it locks up. Otherwise, you'll have to > >> build vncpasswd > >> > from > >> > source, but it's straightforward. > >> > > >> > This is a complete shot in the dark, but I'm > >> wondering if > >> > maybe the > >> > /dev/urandom device is somehow giving you > >> problems on some > >> > of your > >> > machines. vncpasswd will read from /dev/urandom > >> when it > >> > generates an > >> > OTP. If that is the problem, then adding > >> > > >> > #undef UseDevUrandom > >> > > >> > to the top of vncpasswd.c should temporarily > >> work around it, > >> > and it > >> > would be easy to add a more permanent workaround > >> (a command > >> > line switch > >> > that avoids using /dev/urandom.) > >> > > >> > > >> > On 10/23/13 1:36 PM, Rafael Guimaraes wrote: > >> > > Hi, > >> > > > >> > > I have created a VNC session and after using > >> it for sometime, it seems > >> > > to be "locked". It is configured to use > >> one-time-passwords and I just > >> > > can't generate a new otp for access the > >> session, vncpasswd hangs > >> > > indefinitely when trying to get a new password. 
> >> > > > >> > > The session was created by the following > >> command line: > >> > > /opt/TurboVNC/bin/Xvnc :13 -desktop X -httpd > >> > > /opt/TurboVNC/bin/../vnc/classes -auth > >> /u/cnhu/.Xauthority > >> > > -dontdisconnect -geometry 3192x1046 -depth 24 > >> -rfbwait 120000 -otpauth > >> > > -rfbport 5913 -fp > >> > > > >> > /usr/share/X11/fonts/misc,/usr/share/X11/fonts/75dpi,/usr/share/X11/fonts/100dpi,/usr/share/X11/fonts/Type1,/usr/share/fonts/default/Type1 > >> > > -co /usr/share/X11/rgb -deferupdate 1 > >> > > > >> > > When trying to launch a "vncpasswd -otp > >> -display :13", I get no response > >> > > and eventually I have to press Ctrl+C. I have > >> runned the vncpasswd > >> > > command line with strace and it seems to hang > >> while trying to connect to > >> > > Xvnc through the "/tmp/.X11-unix/X13" socket > >> (not sure if I've got it > >> > > right). Below you may check the end of the > >> strace (until it hangs). > >> > > > >> > > What may the problem be? The same happened > >> twice in two different > >> > > machines. I have several other VNC sessions > >> where this does not happen. > >> > > Can I collect information from anywhere else > >> that could help me trying > >> > > to figure out what happened? 
> >> > > > >> > > uname({sys="Linux", node="server03", ...}) = 0 > >> > > socket(PF_FILE, SOCK_STREAM, 0) = 3 > >> > > uname({sys="Linux", node="server03", ...}) = 0 > >> > > uname({sys="Linux", node="server03", ...}) = 0 > >> > > connect(3, {sa_family=AF_FILE, > >> path="/tmp/.X11-unix/X13"...}, 20) = 0 > >> > > uname({sys="Linux", node="server03", ...}) = 0 > >> > > fcntl(3, F_SETFD, FD_CLOEXEC) = 0 > >> > > access("/u/cnhu/.Xauthority", R_OK) = 0 > >> > > open("/u/cnhu/.Xauthority", O_RDONLY) = 4 > >> > > fstat(4, {st_mode=S_IFREG|0600, st_size=13139, > >> ...}) = 0 > >> > > mmap(NULL, 32768, PROT_READ|PROT_WRITE, > >> MAP_PRIVATE|MAP_ANONYMOUS, -1, > >> > > 0) = 0x2ab9583b5000 > >> > > read(4, > >> "\1\0\0\10mi080113\0\00210\0\22MIT-MAGIC-COOK"..., 32768) = > 13139 > >> > > read(4, "", 32768) = 0 > >> > > close(4) = 0 > >> > > munmap(0x2ab9583b5000, 32768) = 0 > >> > > writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, > >> {"MIT-MAGIC-COOKIE-1", > >> > > 18}, {"\0\0", 2}, > >> {"e\30\226A>\316@\17Y8\365+V\21\365\f", 16}], 4) = 48 > >> > > fcntl(3, F_GETFL) = 0x2 (flags > O_RDWR) > >> > > fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 > >> > > read(3, 0x7fff0b98aab0, 8) = -1 > >> EAGAIN (Resource > >> > > temporarily unavailable) > >> > > poll([{fd=3, events=POLLIN}], 1, -1 > >> <unfinished ...> |
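The hang reported in this thread is XOpenDisplay() waiting in poll() with no timeout, so the caller can only bound it from the outside, which is the idea behind the alarm()-based patch quoted in the thread. The following is a minimal sketch of that pattern in C, not the TurboVNC code. Everything named here is an assumption made for illustration: blocking_open() is a hypothetical stand-in for XOpenDisplay() against a wedged server, and quick_open(), on_timeout() and open_with_timeout() are made-up helper names. The guarded call runs in a forked child only so that the caller can observe the outcome, and, unlike the quoted patch, the handler restricts itself to write() and _exit(), which POSIX lists as async-signal-safe.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for XOpenDisplay() on a server that never answers:
   it waits forever for a reply, like the poll(..., -1) at the end of the
   strace output in this thread. */
static int blocking_open(void)
{
    int fds[2];
    struct pollfd pfd;

    if (pipe(fds) < 0)
        return -1;
    pfd.fd = fds[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    /* Nothing is ever written to fds[1], so the read end never becomes
       readable and this loop does not terminate on its own. */
    for (;;) {
        if (poll(&pfd, 1, -1) > 0)
            break;
    }
    return 0;
}

/* Hypothetical stand-in for a healthy server: returns immediately. */
static int quick_open(void)
{
    return 0;
}

/* SIGALRM handler: report and leave, using only async-signal-safe calls. */
static void on_timeout(int signum)
{
    static const char msg[] = "unable to communicate with display (timed out)\n";
    ssize_t n;

    (void)signum;
    n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)n;
    _exit(1);
}

/* Run open_fn in a child process with a deadline of `seconds`.
   Returns 0 if open_fn returned 0 in time, 1 if the deadline expired,
   2 if open_fn returned an error, and -1 if fork() or waitpid() failed. */
static int open_with_timeout(int (*open_fn)(void), unsigned int seconds)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct sigaction sa;
        int rc;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = on_timeout;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGALRM, &sa, NULL);
        alarm(seconds);
        rc = open_fn();
        alarm(0); /* the call returned, so cancel the pending alarm */
        _exit(rc == 0 ? 0 : 2);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling open_with_timeout(blocking_open, 10) would mirror the 10-second limit used in the patch; in vncpasswd itself the guarded call would be XOpenDisplay(displayname) rather than the stand-in used here.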
From: Rafael G. <rg...@gm...> - 2013-10-28 18:07:01
|
Sorry for not being specific from the beginning. I thought I had mentioned it.

Anyway, I will upgrade to the nightly release (1.2.1) and hope it doesn't happen again. However, don't you think it would be nice to have some kind of protection in vncpasswd (like the patch I implemented here and sent you), so that external bugs cannot cause vncpasswd to freeze (as happened here)? I think it could improve vncpasswd's fault tolerance.

Best regards,

Rafael

2013/10/25 DRC <dco...@us...>

> OK, well, why didn't you mention that before? Your original message
> implied that the lock-up either just occurred randomly or that it
> occurred as a result of running vncpasswd. Now you're saying that it
> occurs as a result of doing something specific in OpenWorks, which makes
> a lot more sense. In the process of working with Santos, we've
> uncovered a few issues like this in the past that were specific to oil &
> gas apps (those apps seem to be very good at finding bugs in X servers.)
>
> I am looking into the mouse button issue, but since that just affects
> the viewer, you should still be able to use the 1.2 server. If you need
> the 1.1 applet, then just do the following:
>
> -- Install the 1.1 RPM
> -- mv /opt/TurboVNC/vnc/classes /tmp
> -- Upgrade to the 1.2 RPM
> -- mv /opt/TurboVNC/java /opt/TurboVNC/java.bak
> -- mv /tmp/classes /opt/TurboVNC/java
>
> The 1.2 server should now serve up the 1.1 applet.
>
> I looked at the change log, and I don't see anything obvious that has
> changed in the server since 1.1 that might affect this, unless you're
> running into the 32-bit limitation of the idle timer that was fixed in
> 1.2 beta (but I doubt it-- the symptoms of that issue were different.)
> Regardless, though, a lot has changed in 1.2, so I really do recommend
> upgrading. Minimally, it ensures that we're on the same page.
>
>
> On 10/25/13 12:55 PM, Rafael Guimaraes wrote:
> > In fact, I think that it may be difficult for you to replicate the
> > problem.
> > I am not able to do it either... It happened twice with the
> > same user and he couldn't explain to me what caused the locking. He was
> > using some software from Landmark (OpenWorks), something he does on a
> > daily basis, and the TurboVNC session closed. After that the Xvnc is
> > locked as I reported.
> > There is also one issue that I thought I had mentioned, but I was
> > revising my emails and I haven't... I am using TurboVNC 1.1, because of
> > the bug I previously reported (problems when 2 mouse buttons are pressed
> > simultaneously).
> >
> > Anyway, gdb told me the following:
> >
> > #0 0x000000371f6ce3b3 in __select_nocancel () from /lib64/libc.so.6
> > #1 0x000000000045883f in CheckConnections () at connection.c:997
> > #2 0x0000000000456ba7 in WaitForSomething (pClientsReady=0xed922c0) at WaitFor.c:356
> > #3 0x0000000000442b19 in Dispatch () at dispatch.c:259
> > #4 0x00000000004270fd in main (argc=24, argv=0x7fff92910dd8) at main.c:400
> >
> > It keeps on this select even when I run vncpasswd.
> >
> > I will try to find if some specific function of OpenWorks is causing
> > Xvnc to hang so that I can have more information about the problem.
> > For the moment, since XOpenDisplay may hang, I have implemented a hack
> > in vncpasswd code (v. 1.2, since I tried it with Xvnc 1.1) that I would
> > like to share, if you think that this may be useful in order to be more
> > fault tolerant (it waits for 10 seconds for XOpenDisplay return,
> > otherwise it exits).
> >
> > --- vncpasswd.c 2012-09-30 21:06:15.000000000 -0300
> > +++ /tmp/vncpasswd.c 2013-10-25 15:50:28.690931098 -0200
> > @@ -37,7 +37,8 @@
> >  #include <sys/types.h>
> >  #include <unistd.h>
> >  #include <errno.h>
> >  #include "vncauth.h"
> > +#include <signal.h>
> >
> >  #include <X11/Xlib.h>
> >  #include <X11/Xatom.h>
> > @@ -64,6 +65,20 @@
> >  int otpClear;
> >  char* displayname;
> >
> > +void displaytimeout(int);
> > +int displayreply;
> > +#define DISPLAYTIMEOUT 10
> > +
> > +
> > +void displaytimeout(int signum)
> > +{
> > +  if (!displayreply) {
> > +    fprintf(stderr, "unable to communicate to display \"%s\"\n",
> > +            XDisplayName(displayname));
> > +    exit(1);
> > +  }
> > +}
> > +
> >
> >  int DoOTP()
> >  {
> > @@ -78,19 +93,21 @@
> >    int fd;
> >  #endif
> >
> > +  signal(SIGALRM, displaytimeout);
> > +  alarm(DISPLAYTIMEOUT);
> > +  displayreply=0;
> >    if ((dpy = XOpenDisplay(displayname)) == NULL) {
> >      fprintf(stderr, "unable to open display \"%s\"\n",
> >              XDisplayName(displayname));
> >      return(1);
> >    }
> > -
> > +  displayreply=1;
> >    prop = XInternAtom(dpy, "VNC_OTP", True);
> >    if (prop == None) {
> >      fprintf(stderr, "The X display \"%s\" does not support VNC one-time passwords\n",
> >              XDisplayName(displayname));
> >      return(1);
> >    }
> > -
> >    if (otpClear) {
> >      len = 0;
> >
> >
> > Cheers,
> >
> > Rafael
> >
> >
> >
> > 2013/10/25 DRC <dco...@us... <mailto:dco...@us...>>
> >
> > Really not sure what's going on here. I am running CentOS 5.9 as
> > well, and I launched Xvnc using the exact same command line that you
> > used below. I set up a script that generates 10,000 OTPs in rapid
> > succession, then I ran 10 copies of it in parallel, and all of them
> > ran to completion with no errors or lockups. I tried again using a
> > script that creates 1000 OTPs in rapid succession, and I ran 100
> > copies of it.
This was using the latest stable build of 1.2.x > > available at http://virtualgl.sourceforge.net/vnc.nightly/, but > > nothing has changed since 1.2 that would have affected this, so I > > would expect identical behavior from 1.2. > > > > The fact that you have been running for years with no errors really > > seems to indicate that something has changed at the system level. > > We have not modified the way OTPs are generated, and in fact, I > > actually don't think that feature has been touched since it was > > first introduced in 1.0. > > > > One comment I will make is that you aren't setting up the VNC > > password correctly below. You have to pass an argument of -rfbauth > > to Xvnc in order to set it up with a "normal" VNC password. > > > > Rather than an strace output, it would be more useful to know where > > in the Xvnc code the "infinite loop" is occurring. You should be > > able to get that info with gdb. In the past, there have been > > several cases in which I was able to diagnose an Xvnc error by > > simply looking at the code, without even having to reproduce it. > > > > > > On 10/25/13 6:46 AM, Rafael Guimaraes wrote: > >> Replying what you asked: > >> > >> - I really didn't mention my OS. It is a CentOS 5.9 64 bits. It is > >> pretty much updated. > >> - Yes, when the X server locks up it remains locked. I have one > >> Xvnc locked up right now and that's where I am performing all the > >> tests. > >> - Sockets permissions as well as its directories are ok, I have > >> just checked it (they are all 777, with sticky bit for /tmp and > >> /tmp/.X11-unix and setuid for the socket file) > >> - When I run xwd on the same session, it also freezes. And it > >> communicates with XVnc using a regular socket (through port 6013). > >> This means that Xvnc is not responding the poll no matter how I > >> communicate (through a local or a regular socket) and it also > >> means that the problem is not restricted to generating OTP > >> passwords. 
Based on that, I think that disabling local sockets > >> won't help. > >> - I have also been running a portal based on TurboVNC/VirtualGL > >> for a couple of years and that's the first time that something > >> similar happens. > >> > >> I don't know if this happens anyway, but I have runned strace on > >> the problematic Xvnc process and it keeps doing just the same > >> (looping over the same system calls shown below), even when I run > >> a vncpasswd, while I can see a different behavior, i.e. normal > >> processing, when I telnet to port 5913. > >> > >> select(128, [0 1 3 4 5 6 7 9 11 13 14 17 18 19 20 21 22 24 25 26 > >> 27], NULL, NULL, {593, 19000}) = -1 EBADF (Bad file descriptor) > >> select(6, [5], NULL, NULL, {0, 0}) = 0 (Timeout) > >> select(7, [6], NULL, NULL, {0, 0}) = 0 (Timeout) > >> select(8, [7], NULL, NULL, {0, 0}) = 1 (in [7], left {0, 0}) > >> select(12, [11], NULL, NULL, {0, 0}) = 0 (Timeout) > >> select(14, [13], NULL, NULL, {0, 0}) = 1 (in [13], left {0, 0}) > >> select(15, [14], NULL, NULL, {0, 0}) = 0 (Timeout) > >> select(18, [17], NULL, NULL, {0, 0}) = 0 (Timeout) > >> select(19, [18], NULL, NULL, {0, 0}) = 1 (in [18], left {0, 0}) > >> select(20, [19], NULL, NULL, {0, 0}) = 0 (Timeout) > >> select(21, [20], NULL, NULL, {0, 0}) = 0 (Timeout) > >> select(22, [21], NULL, NULL, {0, 0}) = 0 (Timeout) > >> select(23, [22], NULL, NULL, {0, 0}) = 0 (Timeout) > >> select(25, [24], NULL, NULL, {0, 0}) = 0 (Timeout) > >> select(26, [25], NULL, NULL, {0, 0}) = 1 (in [25], left {0, 0}) > >> select(27, [26], NULL, NULL, {0, 0}) = 0 (Timeout) > >> select(28, [27], NULL, NULL, {0, 0}) = 0 (Timeout) > >> select(13, [3], NULL, NULL, {0, 0}) = 0 (Timeout) > >> select(5, [4], NULL, NULL, {0, 0}) = 0 (Timeout) > >> > >> - I also tried to set up a regular VNC password for the session (I > >> did it by running "vncpasswd -d :13", is it correct? At that time, > >> I monitored Xvnc with strace and it did nothing different from the > >> selects shown above). 
Then I tried to open the VncViewer passing > >> the password through command line. I've got the following output: > >> > >> Initializing... > >> Connecting to localhost, port 5913... > >> Connected to server > >> Tentando conectar em localhost na porta 5555 > >> RFB server supports protocol version 3.8 > >> Using RFB protocol version 3.8 > >> Enabling TightVNC protocol extensions > >> Performing standard VNC authentication > >> Error: The one-time password has not been set on the server > >> java.lang.Exception: The one-time password has not been set on the > >> server > >> at RfbProto.readConnFailedReason(RfbProto.java:574) > >> at RfbProto.readSecurityResult(RfbProto.java:556) > >> at RfbProto.authenticateVNC(RfbProto.java:475) > >> at VncViewer.connectAndAuthenticate(VncViewer.java:370) > >> at VncViewer.run(VncViewer.java:188) > >> at java.lang.Thread.run(Unknown Source) > >> RFB socket closed > >> Closing window > >> Disconnecting > >> > >> - And the following strace for Xvnc > >> > >> accept(3, {sa_family=AF_INET, sin_port=htons(57854), > >> sin_addr=inet_addr("127.0.0.1")}, [54348846876065808]) = 8 > >> fcntl(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 > >> setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0 > >> write(2, "\n", 1) = 1 > >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = > 0 > >> write(2, "25/10/2013 09:38:41 ", 20) = 20 > >> write(2, "Got connection from client 127.0"..., 37) = 37 > >> getpeername(8, {sa_family=AF_INET, sin_port=htons(57854), > >> sin_addr=inet_addr("127.0.0.1")}, [34359738384]) = 0 > >> write(8, "RFB 003.008\n", 12) = 12 > >> read(8, "RFB 003.008\n", 12) = 12 > >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = > 0 > >> write(2, "25/10/2013 09:38:44 ", 20) = 20 > >> write(2, "Using protocol version 3.8\n", 27) = 27 > >> write(8, "\2\2\20", 3) = 3 > >> read(8, "\20", 1) = 1 > >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = > 0 > >> write(2, "25/10/2013 09:38:44 ", 20) = 20 
> >> write(2, "Enabling TightVNC protocol exten"..., 38) = 38 > >> write(8, "\0\0\0\0", 4) = 4 > >> write(8, "\0\0\0\1", 4) = 4 > >> write(8, "\0\0\0\2STDVVNCAUTH_", 16) = 16 > >> read(8, "\0\0\0\2", 4) = 4 > >> write(8, "!\236\\U\252\30\25\356\265:\254\275$=\35s", 16) = 16 > >> read(8, "J\221#\247]\376\274I\370C\232\314h\334;\263", 16) = 16 > >> write(8, "\0\0\0\1\0\0\0004", 8) = 8 > >> write(8, "The one-time password has not be"..., 52) = 52 > >> close(8) = 0 > >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = > 0 > >> write(2, "25/10/2013 09:38:44 ", 20) = 20 > >> write(2, "Client 127.0.0.1 gone\n", 22) = 22 > >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = > 0 > >> write(2, "25/10/2013 09:38:44 ", 20) = 20 > >> write(2, "Statistics:\n", 12) = 12 > >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = > 0 > >> write(2, "25/10/2013 09:38:44 ", 20) = 20 > >> write(2, " framebuffer updates 0, rectang"..., 47) = 47 > >> > >> Any ideas? > >> > >> Best regards, > >> > >> Rafael > >> > >> > >> 2013/10/24 DRC <dco...@us... > >> <mailto:dco...@us...>> > >> > >> Since Xvnc is single-threaded, if you are able to telnet in on > >> 59xx, > >> that means that the server is not locked up. That leaves the > >> following > >> as possibilities in my mind: > >> > >> -- Perhaps this is due to a problem in the X11 client library. > >> If so, > >> then there's probably nothing I can do about it. Try updating > >> your > >> system to the latest O/S patches? > >> > >> -- Double-check permissions on /tmp, /tmp/.X11-unix, etc. > >> They should > >> be writable by whoever used is launching the VNC server. > >> > >> -- If all else fails, then perhaps you can work around it by > >> disabling > >> local X11 communications. 
You do this by starting Xvnc with > >> '-nolisten > >> local', which forces it to listen for X11 traffic on a TCP > >> socket rather > >> than a local socket, so it will not use /tmp/.X11-unix/X* in > >> that case. > >> > >> Several things that I still don't understand about this > situation: > >> > >> -- If you mentioned what O/S these machines are running, I > >> missed it, so > >> please provide that information. If it's an O/S I haven't > >> tested, then > >> I'm happy to try reproducing it on my end, but you implied > >> that there > >> was nothing different about the machines that were > >> experiencing the bug > >> vs. the machines that aren't (?) > >> > >> -- When the X server "locks up", does it remain locked? That > >> is, do > >> repeated attempts to run vncpasswd fail in the same way? > >> > >> -- Have you tried to assign a regular VNC password to the > >> session? That > >> would give you a way to log in and verify whether it is still > >> properly > >> accepting client connections once vncpasswd starts failing. > >> > >> I'm not saying that there's no possibility of a bug in > >> TurboVNC, but we > >> do have thousands of seats worldwide that are using portals > >> built around > >> the OTP functionality. These portals are dynamically > >> generating OTPs > >> every time a user launches or reconnects to a session, so it > >> seems to me > >> that if this was a bug in TurboVNC, someone else would have > >> stumbled > >> upon it by now. > >> > >> > >> On 10/24/13 9:26 AM, Rafael Guimaraes wrote: > >> > Hi DRC, > >> > > >> > Just some last information... I have tried to telnet Xvnc > >> (in ports 5813 > >> > and 5913, since I using display 13) in order to see if the > >> Xvnc process > >> > is completely locked or not and it seems to be responding to > my > >> > connections correclty. > >> > > >> > I have opened a telnet to port 5813, sent a "GET > >> /VncViewer.jar" and was > >> > able to download the applet. 
Then I opened a telnet to port > >> 5913, > >> > received "RFB 003.008", send "RFB 003.003" and got some > >> binary info I > >> > was not able to figure out... But it seems to be responding > >> correctly, > >> > it is not completely locked. I don't know what further tests > >> should I do > >> > for finding out what's happening. Since it has already > >> happened twice, > >> > in two different machines, I would like to know as much as I > >> possible in > >> > order to avoid it from happening again in the future... > >> > > >> > Cheers, > >> > > >> > Rafael > >> > > >> > > >> > 2013/10/24 Rafael Guimaraes <rg...@gm... > >> <mailto:rg...@gm...> <mailto:rg...@gm... > >> <mailto:rg...@gm...>>> > >> > > >> > Hi DRC, > >> > > >> > I overcame my laziness and launched vncpasswd with gdb. > >> The result > >> > was pretty much the same of what strace and printfs have > >> shown. The > >> > problem really seems to be in the communication with the > >> Xvnc server... > >> > > >> > Cheers, > >> > > >> > Rafael > >> > > >> > > >> > > >> > Reading symbols from > >> /opt/TurboVNC/bin/vncpasswd...Reading symbols > >> > from > /usr/lib/debug/opt/TurboVNC/bin/vncpasswd.debug...done. > >> > done. > >> > (gdb) r -o -display :13 > >> > Starting program: /opt/TurboVNC/bin/vncpasswd -o > >> -display :13 > >> > > >> > Program received signal SIGINT, Interrupt. > >> > 0x000000371f6cc30f in poll () from /lib64/libc.so.6 > >> > (gdb) bt > >> > #0 0x000000371f6cc30f in poll () from /lib64/libc.so.6 > >> > #1 0x000000372124a9f0 in ?? () from > /usr/lib64/libX11.so.6 > >> > #2 0x000000372124ae19 in _XRead () from > >> /usr/lib64/libX11.so.6 > >> > #3 0x00000037212378c9 in XOpenDisplay () from > >> /usr/lib64/libX11.so.6 > >> > #4 0x000000000040187c in DoOTP () at vncpasswd.c:81 > >> > #5 0x0000000000402187 in main (argc=4, > >> argv=0x7fffffffe748) at > >> > vncpasswd.c:293 > >> > (gdb) > >> > > >> > > >> > > >> > 2013/10/24 Rafael Guimaraes <rg...@gm... 
> >> <mailto:rg...@gm...> <mailto:rg...@gm... > >> <mailto:rg...@gm...>>> > >> > > >> > Hi DRC, > >> > > >> > I have followed the best debugging practice there is > >> (don't > >> > worry, I'm being ironic :)), putting printfs around > >> the possible > >> > problem and I figured out that the vncpasswd hangs > >> when calling > >> > XOpenDisplay. When executing the code below, it > >> prints "Opening > >> > display :13" and nothing else. If I run it with > >> strace, I get > >> > the trace below. > >> > > >> > I didn't do the gdb test just because I am not very > >> familiar > >> > with doing it by gdb (I've already done it before, > >> but there has > >> > been centuries). If the following information is not > >> enough and > >> > gdb would help, I am sure I can handle it, no > >> problem. I am just > >> > trying to make thing easier for me! A little bit > >> egocentric, I > >> > know, but I just can't help it... :) > >> > > >> > If you check the strace output, it hangs on the last > >> line, when > >> > I seems to be trying to communicate through a socket > >> > "/tmp/.X11-unix/X13". I imagine it is trying to > >> reach Xvnc on > >> > display 13 and it is getting no response, so it > >> keeps waiting > >> > forever... Am I right? What should I do if this is > >> the case? 
> >> > > >> > Cheers, > >> > > >> > Rafael > >> > > >> > VNCPASSWD CODE WITH PRINTFS: > >> > > >> > int DoOTP() > >> > { > >> > unsigned int full; > >> > unsigned int view = 0; > >> > Display* dpy; > >> > Atom prop; > >> > int len; > >> > char buf[MAXPWLEN + 1]; > >> > char bytes[MAXPWLEN * 2]; > >> > #ifdef UseDevUrandom > >> > int fd; > >> > #endif > >> > > >> > *printf("Opening display %s\n",displayname);* > >> > if ((dpy = XOpenDisplay(displayname)) == NULL) { > >> > fprintf(stderr, "unable to open display > \"%s\"\n", > >> > XDisplayName(displayname)); > >> > return(1); > >> > } > >> > *printf("Display opened\n");* > >> > > >> > > >> > STRACE: > >> > > >> > write(1, "Opening display :13\n", 20Opening display > :13 > >> > ) = 20 > >> > brk(0) = 0x59f3000 > >> > brk(0x5a14000) = 0x5a14000 > >> > uname({sys="Linux", node="server03", ...}) = 0 > >> > socket(PF_FILE, SOCK_STREAM, 0) = 3 > >> > uname({sys="Linux", node="server03", ...}) = 0 > >> > uname({sys="Linux", node="server03", ...}) = 0 > >> > connect(3, {sa_family=AF_FILE, > >> path="/tmp/.X11-unix/X13"...}, > >> > 20) = 0 > >> > uname({sys="Linux", node="server03", ...}) = 0 > >> > fcntl(3, F_SETFD, FD_CLOEXEC) = 0 > >> > access("/u/cnhu/.Xauthority", R_OK) = 0 > >> > open("/u/cnhu/.Xauthority", O_RDONLY) = 4 > >> > fstat(4, {st_mode=S_IFREG|0600, st_size=13139, ...}) > = 0 > >> > mmap(NULL, 32768, PROT_READ|PROT_WRITE, > >> > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b34aa97f000 > >> > read(4, > >> "\1\0\0\10mi080113\0\00210\0\22MIT-MAGIC-COOK"..., > >> > 32768) = 13139 > >> > read(4, "", 32768) = 0 > >> > close(4) = 0 > >> > munmap(0x2b34aa97f000, 32768) = 0 > >> > writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, > >> > {"MIT-MAGIC-COOKIE-1", 18}, {"\0\0", 2}, > >> > {"e\30\226A>\316@\17Y8\365+V\21\365\f", 16}], 4) = 48 > >> > fcntl(3, F_GETFL) = 0x2 (flags O_RDWR) > >> > fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 > >> > read(3, 0x7fff865304b0, 8) = -1 EAGAIN (Resource > >> > temporarily unavailable) > >> > 
poll([{fd=3, events=POLLIN}], 1, -1 <unfinished ...> > >> > > >> > > >> > > >> > > >> > 2013/10/24 DRC <dco...@us... > >> <mailto:dco...@us...> > >> > <mailto:dco...@us... > >> <mailto:dco...@us...>>> > >> > > >> > Can you figure out where in the vncpasswd code > >> it's locking > >> > up? That > >> > would help diagnose the issue. If you're using > >> an RPM-based > >> > system, you > >> > can install the turbovnc-debuginfo RPM and get a > >> stack trace > >> > from gdb > >> > when it locks up. Otherwise, you'll have to > >> build vncpasswd > >> > from > >> > source, but it's straightforward. > >> > > >> > This is a complete shot in the dark, but I'm > >> wondering if > >> > maybe the > >> > /dev/urandom device is somehow giving you > >> problems on some > >> > of your > >> > machines. vncpasswd will read from /dev/urandom > >> when it > >> > generates an > >> > OTP. If that is the problem, then adding > >> > > >> > #undef UseDevUrandom > >> > > >> > to the top of vncpasswd.c should temporarily > >> work around it, > >> > and it > >> > would be easy to add a more permanent workaround > >> (a command > >> > line switch > >> > that avoids using /dev/urandom.) > >> > > >> > > >> > On 10/23/13 1:36 PM, Rafael Guimaraes wrote: > >> > > Hi, > >> > > > >> > > I have created a VNC session and after using > >> it for sometime, it seems > >> > > to be "locked". It is configured to use > >> one-time-passwords and I just > >> > > can't generate a new otp for access the > >> session, vncpasswd hangs > >> > > indefinitely when trying to get a new password. 
> >> > > > >> > > The session was created by the following > >> command line: > >> > > /opt/TurboVNC/bin/Xvnc :13 -desktop X -httpd > >> > > /opt/TurboVNC/bin/../vnc/classes -auth > >> /u/cnhu/.Xauthority > >> > > -dontdisconnect -geometry 3192x1046 -depth 24 > >> -rfbwait 120000 -otpauth > >> > > -rfbport 5913 -fp > >> > > > >> > /usr/share/X11/fonts/misc,/usr/share/X11/fonts/75dpi,/usr/share/X11/fonts/100dpi,/usr/share/X11/fonts/Type1,/usr/share/fonts/default/Type1 > >> > > -co /usr/share/X11/rgb -deferupdate 1 > >> > > > >> > > When trying to launch a "vncpasswd -otp > >> -display :13", I get no response > >> > > and eventually I have to press Ctrl+C. I have > >> runned the vncpasswd > >> > > command line with strace and it seems to hang > >> while trying to connect to > >> > > Xvnc through the "/tmp/.X11-unix/X13" socket > >> (not sure if I've got it > >> > > right). Below you may check the end of the > >> strace (until it hangs). > >> > > > >> > > What may the problem be? The same happened > >> twice in two different > >> > > machines. I have several other VNC sessions > >> where this does not happen. > >> > > Can I collect information from anywhere else > >> that could help me trying > >> > > to figure out what happened? 
> >> > >
> >> > > uname({sys="Linux", node="server03", ...}) = 0
> >> > > socket(PF_FILE, SOCK_STREAM, 0) = 3
> >> > > uname({sys="Linux", node="server03", ...}) = 0
> >> > > uname({sys="Linux", node="server03", ...}) = 0
> >> > > connect(3, {sa_family=AF_FILE, path="/tmp/.X11-unix/X13"...}, 20) = 0
> >> > > uname({sys="Linux", node="server03", ...}) = 0
> >> > > fcntl(3, F_SETFD, FD_CLOEXEC) = 0
> >> > > access("/u/cnhu/.Xauthority", R_OK) = 0
> >> > > open("/u/cnhu/.Xauthority", O_RDONLY) = 4
> >> > > fstat(4, {st_mode=S_IFREG|0600, st_size=13139, ...}) = 0
> >> > > mmap(NULL, 32768, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2ab9583b5000
> >> > > read(4, "\1\0\0\10mi080113\0\00210\0\22MIT-MAGIC-COOK"..., 32768) = 13139
> >> > > read(4, "", 32768) = 0
> >> > > close(4) = 0
> >> > > munmap(0x2ab9583b5000, 32768) = 0
> >> > > writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, {"MIT-MAGIC-COOKIE-1", 18}, {"\0\0", 2}, {"e\30\226A>\316@\17Y8\365+V\21\365\f", 16}], 4) = 48
> >> > > fcntl(3, F_GETFL) = 0x2 (flags O_RDWR)
> >> > > fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> >> > > read(3, 0x7fff0b98aab0, 8) = -1 EAGAIN (Resource temporarily unavailable)
> >> > > poll([{fd=3, events=POLLIN}], 1, -1 <unfinished ...>
|
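The strace and gdb output quoted in this thread show the client blocked in poll() with a timeout of -1, waiting for a reply that the X server never sends. The following is a minimal sketch of the general idea of a bounded wait, for illustration only. It is not TurboVNC or libX11 code: wait_readable() and probe_silent_peer() are hypothetical names, and the socketpair merely stands in for a server that accepts a request but never answers.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Wait until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 * Note: if a signal interrupts poll(), the wait restarts with the
 * full timeout, so the bound is approximate. */
int wait_readable(int fd, int timeout_ms)
{
  struct pollfd pfd;
  int rc;

  assert(fd >= 0);
  pfd.fd = fd;
  pfd.events = POLLIN;
  pfd.revents = 0;

  do {
    rc = poll(&pfd, 1, timeout_ms);
  } while (rc < 0 && errno == EINTR);

  if (rc < 0)
    return -1;
  return rc > 0 ? 1 : 0;
}

/* Hypothetical demo: send a request to a peer that never replies
 * (a stand-in for the wedged server) and wait with a finite timeout.
 * Returns the result of wait_readable(), or -1 on setup failure. */
int probe_silent_peer(int timeout_ms)
{
  static const char request[] = "hello";
  int sv[2];
  int result;

  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
    return -1;

  if (write(sv[0], request, sizeof(request)) < 0) {
    close(sv[0]);
    close(sv[1]);
    return -1;
  }

  /* The peer (sv[1]) never writes back, so this can only time out. */
  result = wait_readable(sv[0], timeout_ms);

  close(sv[0]);
  close(sv[1]);
  return result;
}
```

Under these assumptions, probe_silent_peer(200) gives up after roughly 200 ms, whereas the poll(..., -1) call seen in the trace would block indefinitely in the same situation.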
From: DRC <dco...@us...> - 2013-10-28 18:20:51
|
I don't like introducing code that adds complexity and masks an issue. My understanding from you is that, when this lock-up occurs, the X server becomes generally unresponsive, so I don't understand why working around the issue only in vncpasswd would make any difference. Other apps would still not work.

There is nothing wrong with the way vncpasswd is written. It is written like every other X11 app on the planet. They all expect that XOpenDisplay() will be well-behaved, and if it isn't, then that needs to be addressed in the X server.

> On Oct 28, 2013, at 1:06 PM, Rafael Guimaraes <rg...@gm...> wrote:
>
> Sorry for not being specific from the beginning. I thought I had mentioned it. Anyway, I will upgrade to the nightly release (1.2.1) and hope it doesn't happen again.
> However, don't you think it would be nice to have some kind of protection in vncpasswd (like the patch I've implemented here and sent you) so that external bugs cannot cause vncpasswd to freeze (as happened here)? I think it could improve vncpasswd's fault tolerance.
>
> Best regards,
>
> Rafael
>
> 2013/10/25 DRC <dco...@us...>
>> OK, well, why didn't you mention that before? Your original message implied that the lock-up either just occurred randomly or that it occurred as a result of running vncpasswd. Now you're saying that it occurs as a result of doing something specific in OpenWorks, which makes a lot more sense. In the process of working with Santos, we've uncovered a few issues like this in the past that were specific to oil & gas apps (those apps seem to be very good at finding bugs in X servers.)
>>
>> I am looking into the mouse button issue, but since that just affects the viewer, you should still be able to use the 1.2 server.
>> If you need the 1.1 applet, then just do the following:
>>
>> -- Install the 1.1 RPM
>> -- mv /opt/TurboVNC/vnc/classes /tmp
>> -- Upgrade to the 1.2 RPM
>> -- mv /opt/TurboVNC/java /opt/TurboVNC/java.bak
>> -- mv /tmp/classes /opt/TurboVNC/java
>>
>> The 1.2 server should now serve up the 1.1 applet.
>>
>> I looked at the change log, and I don't see anything obvious that has changed in the server since 1.1 that might affect this, unless you're running into the 32-bit limitation of the idle timer that was fixed in 1.2 beta (but I doubt it-- the symptoms of that issue were different.) Regardless, though, a lot has changed in 1.2, so I really do recommend upgrading. Minimally, it ensures that we're on the same page.
>>
>>
>> On 10/25/13 12:55 PM, Rafael Guimaraes wrote:
>> > In fact, I think that it may be difficult for you to replicate the problem. I am not able to do it either... It happened twice with the same user and he couldn't explain to me what caused the locking. He was using some software from Landmark (OpenWorks), something he does on a daily basis, and the TurboVNC session closed. After that the Xvnc is locked as I reported.
>> > There is also one issue that I thought I had mentioned, but I was rereading my emails and I hadn't... I am using TurboVNC 1.1, because of the bug I previously reported (problems when 2 mouse buttons are pressed simultaneously).
>> >
>> > Anyway, gdb told me the following:
>> >
>> > #0 0x000000371f6ce3b3 in __select_nocancel () from /lib64/libc.so.6
>> > #1 0x000000000045883f in CheckConnections () at connection.c:997
>> > #2 0x0000000000456ba7 in WaitForSomething (pClientsReady=0xed922c0) at WaitFor.c:356
>> > #3 0x0000000000442b19 in Dispatch () at dispatch.c:259
>> > #4 0x00000000004270fd in main (argc=24, argv=0x7fff92910dd8) at main.c:400
>> >
>> > It stays in this select even when I run vncpasswd.
>> >
>> > I will try to find out whether some specific function of OpenWorks is causing Xvnc to hang so that I can have more information about the problem.
>> > For the moment, since XOpenDisplay may hang, I have implemented a hack in the vncpasswd code (v. 1.2, since I tried it with Xvnc 1.1) that I would like to share, if you think that this may be useful in order to be more fault tolerant (it waits 10 seconds for XOpenDisplay to return, otherwise it exits).
>> >
>> > --- vncpasswd.c 2012-09-30 21:06:15.000000000 -0300
>> > +++ /tmp/vncpasswd.c 2013-10-25 15:50:28.690931098 -0200
>> > @@ -37,7 +37,8 @@
>> >  #include <sys/types.h>
>> >  #include <unistd.h>
>> >  #include <errno.h>
>> >  #include "vncauth.h"
>> > +#include <signal.h>
>> >
>> >  #include <X11/Xlib.h>
>> >  #include <X11/Xatom.h>
>> > @@ -64,6 +65,20 @@
>> >  int otpClear;
>> >  char* displayname;
>> >
>> > +void displaytimeout(int);
>> > +int displayreply;
>> > +#define DISPLAYTIMEOUT 10
>> > +
>> > +
>> > +void displaytimeout(int signum)
>> > +{
>> > +  if (!displayreply) {
>> > +    fprintf(stderr, "unable to communicate to display \"%s\"\n",
>> > +            XDisplayName(displayname));
>> > +    exit(1);
>> > +  }
>> > +}
>> > +
>> >
>> >  int DoOTP()
>> >  {
>> > @@ -78,19 +93,21 @@
>> >    int fd;
>> >  #endif
>> >
>> > +  signal(SIGALRM, displaytimeout);
>> > +  alarm(DISPLAYTIMEOUT);
>> > +  displayreply=0;
>> >    if ((dpy = XOpenDisplay(displayname)) == NULL) {
>> >      fprintf(stderr, "unable to open display \"%s\"\n",
>> >              XDisplayName(displayname));
>> >      return(1);
>> >    }
>> > -
>> > +  displayreply=1;
>> >    prop = XInternAtom(dpy, "VNC_OTP", True);
>> >    if (prop == None) {
>> >      fprintf(stderr, "The X display \"%s\" does not support VNC one-time passwords\n",
>> >              XDisplayName(displayname));
>> >      return(1);
>> >    }
>> > -
>> >    if (otpClear) {
>> >      len = 0;
>> >
>> >
>> > Cheers,
>> >
>> > Rafael
>> >
>> >
>> >
>> > 2013/10/25 DRC <dco...@us...
>> > <mailto:dco...@us...>>
>> >
>> > Really not sure what's going on here. I am running CentOS 5.9 as well, and I launched Xvnc using the exact same command line that you used below. I set up a script that generates 10,000 OTPs in rapid succession, then I ran 10 copies of it in parallel, and all of them ran to completion with no errors or lockups. I tried again using a script that creates 1000 OTPs in rapid succession, and I ran 100 copies of it. This was using the latest stable build of 1.2.x available at http://virtualgl.sourceforge.net/vnc.nightly/, but nothing has changed since 1.2 that would have affected this, so I would expect identical behavior from 1.2.
>> >
>> > The fact that you have been running for years with no errors really seems to indicate that something has changed at the system level. We have not modified the way OTPs are generated, and in fact, I actually don't think that feature has been touched since it was first introduced in 1.0.
>> >
>> > One comment I will make is that you aren't setting up the VNC password correctly below. You have to pass an argument of -rfbauth to Xvnc in order to set it up with a "normal" VNC password.
>> >
>> > Rather than an strace output, it would be more useful to know where in the Xvnc code the "infinite loop" is occurring. You should be able to get that info with gdb. In the past, there have been several cases in which I was able to diagnose an Xvnc error by simply looking at the code, without even having to reproduce it.
>> >
>> >
>> > On 10/25/13 6:46 AM, Rafael Guimaraes wrote:
>> >> Replying to what you asked:
>> >>
>> >> - I really didn't mention my OS. It is 64-bit CentOS 5.9. It is pretty much up to date.
>> >> - Yes, when the X server locks up it remains locked. I have one Xvnc locked up right now and that's where I am performing all the tests.
>> >> - The permissions on the sockets and their directories are OK, I have just checked them (they are all 777, with the sticky bit for /tmp and /tmp/.X11-unix and setuid for the socket file)
>> >> - When I run xwd on the same session, it also freezes. And it communicates with Xvnc using a regular socket (through port 6013). This means that Xvnc is not responding to the poll no matter how I communicate (through a local or a regular socket) and it also means that the problem is not restricted to generating OTP passwords. Based on that, I think that disabling local sockets won't help.
>> >> - I have also been running a portal based on TurboVNC/VirtualGL for a couple of years and this is the first time that something similar has happened.
>> >>
>> >> I don't know if this happens anyway, but I have run strace on the problematic Xvnc process and it keeps doing just the same thing (looping over the same system calls shown below), even when I run vncpasswd, while I can see different behavior, i.e. normal processing, when I telnet to port 5913.
>> >>
>> >> select(128, [0 1 3 4 5 6 7 9 11 13 14 17 18 19 20 21 22 24 25 26 27], NULL, NULL, {593, 19000}) = -1 EBADF (Bad file descriptor)
>> >> select(6, [5], NULL, NULL, {0, 0}) = 0 (Timeout)
>> >> select(7, [6], NULL, NULL, {0, 0}) = 0 (Timeout)
>> >> select(8, [7], NULL, NULL, {0, 0}) = 1 (in [7], left {0, 0})
>> >> select(12, [11], NULL, NULL, {0, 0}) = 0 (Timeout)
>> >> select(14, [13], NULL, NULL, {0, 0}) = 1 (in [13], left {0, 0})
>> >> select(15, [14], NULL, NULL, {0, 0}) = 0 (Timeout)
>> >> select(18, [17], NULL, NULL, {0, 0}) = 0 (Timeout)
>> >> select(19, [18], NULL, NULL, {0, 0}) = 1 (in [18], left {0, 0})
>> >> select(20, [19], NULL, NULL, {0, 0}) = 0 (Timeout)
>> >> select(21, [20], NULL, NULL, {0, 0}) = 0 (Timeout)
>> >> select(22, [21], NULL, NULL, {0, 0}) = 0 (Timeout)
>> >> select(23, [22], NULL, NULL, {0, 0}) = 0 (Timeout)
>> >> select(25, [24], NULL, NULL, {0, 0}) = 0 (Timeout)
>> >> select(26, [25], NULL, NULL, {0, 0}) = 1 (in [25], left {0, 0})
>> >> select(27, [26], NULL, NULL, {0, 0}) = 0 (Timeout)
>> >> select(28, [27], NULL, NULL, {0, 0}) = 0 (Timeout)
>> >> select(13, [3], NULL, NULL, {0, 0}) = 0 (Timeout)
>> >> select(5, [4], NULL, NULL, {0, 0}) = 0 (Timeout)
>> >>
>> >> - I also tried to set up a regular VNC password for the session (I did it by running "vncpasswd -d :13", is that correct? At that time, I monitored Xvnc with strace and it did nothing different from the selects shown above). Then I tried to open the VncViewer, passing the password on the command line. I got the following output:
>> >>
>> >> Initializing...
>> >> Connecting to localhost, port 5913...
>> >> Connected to server >> >> Tentando conectar em localhost na porta 5555 >> >> RFB server supports protocol version 3.8 >> >> Using RFB protocol version 3.8 >> >> Enabling TightVNC protocol extensions >> >> Performing standard VNC authentication >> >> Error: The one-time password has not been set on the server >> >> java.lang.Exception: The one-time password has not been set on the >> >> server >> >> at RfbProto.readConnFailedReason(RfbProto.java:574) >> >> at RfbProto.readSecurityResult(RfbProto.java:556) >> >> at RfbProto.authenticateVNC(RfbProto.java:475) >> >> at VncViewer.connectAndAuthenticate(VncViewer.java:370) >> >> at VncViewer.run(VncViewer.java:188) >> >> at java.lang.Thread.run(Unknown Source) >> >> RFB socket closed >> >> Closing window >> >> Disconnecting >> >> >> >> - And the following strace for Xvnc >> >> >> >> accept(3, {sa_family=AF_INET, sin_port=htons(57854), >> >> sin_addr=inet_addr("127.0.0.1")}, [54348846876065808]) = 8 >> >> fcntl(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 >> >> setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0 >> >> write(2, "\n", 1) = 1 >> >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 >> >> write(2, "25/10/2013 09:38:41 ", 20) = 20 >> >> write(2, "Got connection from client 127.0"..., 37) = 37 >> >> getpeername(8, {sa_family=AF_INET, sin_port=htons(57854), >> >> sin_addr=inet_addr("127.0.0.1")}, [34359738384]) = 0 >> >> write(8, "RFB 003.008\n", 12) = 12 >> >> read(8, "RFB 003.008\n", 12) = 12 >> >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 >> >> write(2, "25/10/2013 09:38:44 ", 20) = 20 >> >> write(2, "Using protocol version 3.8\n", 27) = 27 >> >> write(8, "\2\2\20", 3) = 3 >> >> read(8, "\20", 1) = 1 >> >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 >> >> write(2, "25/10/2013 09:38:44 ", 20) = 20 >> >> write(2, "Enabling TightVNC protocol exten"..., 38) = 38 >> >> write(8, "\0\0\0\0", 4) = 4 >> >> write(8, "\0\0\0\1", 4) = 4 >> >> write(8, 
"\0\0\0\2STDVVNCAUTH_", 16) = 16 >> >> read(8, "\0\0\0\2", 4) = 4 >> >> write(8, "!\236\\U\252\30\25\356\265:\254\275$=\35s", 16) = 16 >> >> read(8, "J\221#\247]\376\274I\370C\232\314h\334;\263", 16) = 16 >> >> write(8, "\0\0\0\1\0\0\0004", 8) = 8 >> >> write(8, "The one-time password has not be"..., 52) = 52 >> >> close(8) = 0 >> >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 >> >> write(2, "25/10/2013 09:38:44 ", 20) = 20 >> >> write(2, "Client 127.0.0.1 gone\n", 22) = 22 >> >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 >> >> write(2, "25/10/2013 09:38:44 ", 20) = 20 >> >> write(2, "Statistics:\n", 12) = 12 >> >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) = 0 >> >> write(2, "25/10/2013 09:38:44 ", 20) = 20 >> >> write(2, " framebuffer updates 0, rectang"..., 47) = 47 >> >> >> >> Any ideas? >> >> >> >> Best regards, >> >> >> >> Rafael >> >> >> >> >> >> 2013/10/24 DRC <dco...@us... >> >> <mailto:dco...@us...>> >> >> >> >> Since Xvnc is single-threaded, if you are able to telnet in on >> >> 59xx, >> >> that means that the server is not locked up. That leaves the >> >> following >> >> as possibilities in my mind: >> >> >> >> -- Perhaps this is due to a problem in the X11 client library. >> >> If so, >> >> then there's probably nothing I can do about it. Try updating >> >> your >> >> system to the latest O/S patches? >> >> >> >> -- Double-check permissions on /tmp, /tmp/.X11-unix, etc. >> >> They should >> >> be writable by whoever used is launching the VNC server. >> >> >> >> -- If all else fails, then perhaps you can work around it by >> >> disabling >> >> local X11 communications. You do this by starting Xvnc with >> >> '-nolisten >> >> local', which forces it to listen for X11 traffic on a TCP >> >> socket rather >> >> than a local socket, so it will not use /tmp/.X11-unix/X* in >> >> that case. 
>> >> >> >> Several things that I still don't understand about this situation: >> >> >> >> -- If you mentioned what O/S these machines are running, I >> >> missed it, so >> >> please provide that information. If it's an O/S I haven't >> >> tested, then >> >> I'm happy to try reproducing it on my end, but you implied >> >> that there >> >> was nothing different about the machines that were >> >> experiencing the bug >> >> vs. the machines that aren't (?) >> >> >> >> -- When the X server "locks up", does it remain locked? That >> >> is, do >> >> repeated attempts to run vncpasswd fail in the same way? >> >> >> >> -- Have you tried to assign a regular VNC password to the >> >> session? That >> >> would give you a way to log in and verify whether it is still >> >> properly >> >> accepting client connections once vncpasswd starts failing. >> >> >> >> I'm not saying that there's no possibility of a bug in >> >> TurboVNC, but we >> >> do have thousands of seats worldwide that are using portals >> >> built around >> >> the OTP functionality. These portals are dynamically >> >> generating OTPs >> >> every time a user launches or reconnects to a session, so it >> >> seems to me >> >> that if this was a bug in TurboVNC, someone else would have >> >> stumbled >> >> upon it by now. >> >> >> >> >> >> On 10/24/13 9:26 AM, Rafael Guimaraes wrote: >> >> > Hi DRC, >> >> > >> >> > Just some last information... I have tried to telnet Xvnc >> >> (in ports 5813 >> >> > and 5913, since I using display 13) in order to see if the >> >> Xvnc process >> >> > is completely locked or not and it seems to be responding to my >> >> > connections correclty. >> >> > >> >> > I have opened a telnet to port 5813, sent a "GET >> >> /VncViewer.jar" and was >> >> > able to download the applet. Then I opened a telnet to port >> >> 5913, >> >> > received "RFB 003.008", send "RFB 003.003" and got some >> >> binary info I >> >> > was not able to figure out... 
But it seems to be responding >> >> correctly, >> >> > it is not completely locked. I don't know what further tests >> >> should I do >> >> > for finding out what's happening. Since it has already >> >> happened twice, >> >> > in two different machines, I would like to know as much as I >> >> possible in >> >> > order to avoid it from happening again in the future... >> >> > >> >> > Cheers, >> >> > >> >> > Rafael >> >> > >> >> > >> >> > 2013/10/24 Rafael Guimaraes <rg...@gm... >> >> <mailto:rg...@gm...> <mailto:rg...@gm... >> >> <mailto:rg...@gm...>>> >> >> > >> >> > Hi DRC, >> >> > >> >> > I overcame my laziness and launched vncpasswd with gdb. >> >> The result >> >> > was pretty much the same of what strace and printfs have >> >> shown. The >> >> > problem really seems to be in the communication with the >> >> Xvnc server... >> >> > >> >> > Cheers, >> >> > >> >> > Rafael >> >> > >> >> > >> >> > >> >> > Reading symbols from >> >> /opt/TurboVNC/bin/vncpasswd...Reading symbols >> >> > from /usr/lib/debug/opt/TurboVNC/bin/vncpasswd.debug...done. >> >> > done. >> >> > (gdb) r -o -display :13 >> >> > Starting program: /opt/TurboVNC/bin/vncpasswd -o >> >> -display :13 >> >> > >> >> > Program received signal SIGINT, Interrupt. >> >> > 0x000000371f6cc30f in poll () from /lib64/libc.so.6 >> >> > (gdb) bt >> >> > #0 0x000000371f6cc30f in poll () from /lib64/libc.so.6 >> >> > #1 0x000000372124a9f0 in ?? () from /usr/lib64/libX11.so.6 >> >> > #2 0x000000372124ae19 in _XRead () from >> >> /usr/lib64/libX11.so.6 >> >> > #3 0x00000037212378c9 in XOpenDisplay () from >> >> /usr/lib64/libX11.so.6 >> >> > #4 0x000000000040187c in DoOTP () at vncpasswd.c:81 >> >> > #5 0x0000000000402187 in main (argc=4, >> >> argv=0x7fffffffe748) at >> >> > vncpasswd.c:293 >> >> > (gdb) >> >> > >> >> > >> >> > >> >> > 2013/10/24 Rafael Guimaraes <rg...@gm... >> >> <mailto:rg...@gm...> <mailto:rg...@gm... 
>> >> <mailto:rg...@gm...>>> >> >> > >> >> > Hi DRC, >> >> > >> >> > I have followed the best debugging practice there is >> >> (don't >> >> > worry, I'm being ironic :)), putting printfs around >> >> the possible >> >> > problem and I figured out that the vncpasswd hangs >> >> when calling >> >> > XOpenDisplay. When executing the code below, it >> >> prints "Opening >> >> > display :13" and nothing else. If I run it with >> >> strace, I get >> >> > the trace below. >> >> > >> >> > I didn't do the gdb test just because I am not very >> >> familiar >> >> > with doing it by gdb (I've already done it before, >> >> but there has >> >> > been centuries). If the following information is not >> >> enough and >> >> > gdb would help, I am sure I can handle it, no >> >> problem. I am just >> >> > trying to make thing easier for me! A little bit >> >> egocentric, I >> >> > know, but I just can't help it... :) >> >> > >> >> > If you check the strace output, it hangs on the last >> >> line, when >> >> > I seems to be trying to communicate through a socket >> >> > "/tmp/.X11-unix/X13". I imagine it is trying to >> >> reach Xvnc on >> >> > display 13 and it is getting no response, so it >> >> keeps waiting >> >> > forever... Am I right? What should I do if this is >> >> the case? 
>> >> > >> >> > Cheers, >> >> > >> >> > Rafael >> >> > >> >> > VNCPASSWD CODE WITH PRINTFS: >> >> > >> >> > int DoOTP() >> >> > { >> >> > unsigned int full; >> >> > unsigned int view = 0; >> >> > Display* dpy; >> >> > Atom prop; >> >> > int len; >> >> > char buf[MAXPWLEN + 1]; >> >> > char bytes[MAXPWLEN * 2]; >> >> > #ifdef UseDevUrandom >> >> > int fd; >> >> > #endif >> >> > >> >> > *printf("Opening display %s\n",displayname);* >> >> > if ((dpy = XOpenDisplay(displayname)) == NULL) { >> >> > fprintf(stderr, "unable to open display \"%s\"\n", >> >> > XDisplayName(displayname)); >> >> > return(1); >> >> > } >> >> > *printf("Display opened\n");* >> >> > >> >> > >> >> > STRACE: >> >> > >> >> > write(1, "Opening display :13\n", 20Opening display :13 >> >> > ) = 20 >> >> > brk(0) = 0x59f3000 >> >> > brk(0x5a14000) = 0x5a14000 >> >> > uname({sys="Linux", node="server03", ...}) = 0 >> >> > socket(PF_FILE, SOCK_STREAM, 0) = 3 >> >> > uname({sys="Linux", node="server03", ...}) = 0 >> >> > uname({sys="Linux", node="server03", ...}) = 0 >> >> > connect(3, {sa_family=AF_FILE, >> >> path="/tmp/.X11-unix/X13"...}, >> >> > 20) = 0 >> >> > uname({sys="Linux", node="server03", ...}) = 0 >> >> > fcntl(3, F_SETFD, FD_CLOEXEC) = 0 >> >> > access("/u/cnhu/.Xauthority", R_OK) = 0 >> >> > open("/u/cnhu/.Xauthority", O_RDONLY) = 4 >> >> > fstat(4, {st_mode=S_IFREG|0600, st_size=13139, ...}) = 0 >> >> > mmap(NULL, 32768, PROT_READ|PROT_WRITE, >> >> > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b34aa97f000 >> >> > read(4, >> >> "\1\0\0\10mi080113\0\00210\0\22MIT-MAGIC-COOK"..., >> >> > 32768) = 13139 >> >> > read(4, "", 32768) = 0 >> >> > close(4) = 0 >> >> > munmap(0x2b34aa97f000, 32768) = 0 >> >> > writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, >> >> > {"MIT-MAGIC-COOKIE-1", 18}, {"\0\0", 2}, >> >> > {"e\30\226A>\316@\17Y8\365+V\21\365\f", 16}], 4) = 48 >> >> > fcntl(3, F_GETFL) = 0x2 (flags O_RDWR) >> >> > fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 >> >> > read(3, 0x7fff865304b0, 8) = -1 EAGAIN 
(Resource >> >> > temporarily unavailable) >> >> > poll([{fd=3, events=POLLIN}], 1, -1 <unfinished ...> >> >> > >> >> > >> >> > >> >> > >> >> > 2013/10/24 DRC <dco...@us... >> >> <mailto:dco...@us...> >> >> > <mailto:dco...@us... >> >> <mailto:dco...@us...>>> >> >> > >> >> > Can you figure out where in the vncpasswd code >> >> it's locking >> >> > up? That >> >> > would help diagnose the issue. If you're using >> >> an RPM-based >> >> > system, you >> >> > can install the turbovnc-debuginfo RPM and get a >> >> stack trace >> >> > from gdb >> >> > when it locks up. Otherwise, you'll have to >> >> build vncpasswd >> >> > from >> >> > source, but it's straightforward. >> >> > >> >> > This is a complete shot in the dark, but I'm >> >> wondering if >> >> > maybe the >> >> > /dev/urandom device is somehow giving you >> >> problems on some >> >> > of your >> >> > machines. vncpasswd will read from /dev/urandom >> >> when it >> >> > generates an >> >> > OTP. If that is the problem, then adding >> >> > >> >> > #undef UseDevUrandom >> >> > >> >> > to the top of vncpasswd.c should temporarily >> >> work around it, >> >> > and it >> >> > would be easy to add a more permanent workaround >> >> (a command >> >> > line switch >> >> > that avoids using /dev/urandom.) >> >> > >> >> > >> >> > On 10/23/13 1:36 PM, Rafael Guimaraes wrote: >> >> > > Hi, >> >> > > >> >> > > I have created a VNC session and after using >> >> it for sometime, it seems >> >> > > to be "locked". It is configured to use >> >> one-time-passwords and I just >> >> > > can't generate a new otp for access the >> >> session, vncpasswd hangs >> >> > > indefinitely when trying to get a new password. 
>> >> > > >> >> > > The session was created by the following >> >> command line: >> >> > > /opt/TurboVNC/bin/Xvnc :13 -desktop X -httpd >> >> > > /opt/TurboVNC/bin/../vnc/classes -auth >> >> /u/cnhu/.Xauthority >> >> > > -dontdisconnect -geometry 3192x1046 -depth 24 >> >> -rfbwait 120000 -otpauth >> >> > > -rfbport 5913 -fp >> >> > > >> >> /usr/share/X11/fonts/misc,/usr/share/X11/fonts/75dpi,/usr/share/X11/fonts/100dpi,/usr/share/X11/fonts/Type1,/usr/share/fonts/default/Type1 >> >> > > -co /usr/share/X11/rgb -deferupdate 1 >> >> > > >> >> > > When trying to launch a "vncpasswd -otp >> >> -display :13", I get no response >> >> > > and eventually I have to press Ctrl+C. I have >> >> runned the vncpasswd >> >> > > command line with strace and it seems to hang >> >> while trying to connect to >> >> > > Xvnc through the "/tmp/.X11-unix/X13" socket >> >> (not sure if I've got it >> >> > > right). Below you may check the end of the >> >> strace (until it hangs). >> >> > > >> >> > > What may the problem be? The same happened >> >> twice in two different >> >> > > machines. I have several other VNC sessions >> >> where this does not happen. >> >> > > Can I collect information from anywhere else >> >> that could help me trying >> >> > > to figure out what happened? 
>> >> > > >> >> > > uname({sys="Linux", node="server03", ...}) = 0 >> >> > > socket(PF_FILE, SOCK_STREAM, 0) = 3 >> >> > > uname({sys="Linux", node="server03", ...}) = 0 >> >> > > uname({sys="Linux", node="server03", ...}) = 0 >> >> > > connect(3, {sa_family=AF_FILE, >> >> path="/tmp/.X11-unix/X13"...}, 20) = 0 >> >> > > uname({sys="Linux", node="server03", ...}) = 0 >> >> > > fcntl(3, F_SETFD, FD_CLOEXEC) = 0 >> >> > > access("/u/cnhu/.Xauthority", R_OK) = 0 >> >> > > open("/u/cnhu/.Xauthority", O_RDONLY) = 4 >> >> > > fstat(4, {st_mode=S_IFREG|0600, st_size=13139, >> >> ...}) = 0 >> >> > > mmap(NULL, 32768, PROT_READ|PROT_WRITE, >> >> MAP_PRIVATE|MAP_ANONYMOUS, -1, >> >> > > 0) = 0x2ab9583b5000 >> >> > > read(4, >> >> "\1\0\0\10mi080113\0\00210\0\22MIT-MAGIC-COOK"..., 32768) = 13139 >> >> > > read(4, "", 32768) = 0 >> >> > > close(4) = 0 >> >> > > munmap(0x2ab9583b5000, 32768) = 0 >> >> > > writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, >> >> {"MIT-MAGIC-COOKIE-1", >> >> > > 18}, {"\0\0", 2}, >> >> {"e\30\226A>\316@\17Y8\365+V\21\365\f", 16}], 4) = 48 >> >> > > fcntl(3, F_GETFL) = 0x2 (flags O_RDWR) >> >> > > fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 >> >> > > read(3, 0x7fff0b98aab0, 8) = -1 >> >> EAGAIN (Resource >> >> > > temporarily unavailable) >> >> > > poll([{fd=3, events=POLLIN}], 1, -1 >> >> <unfinished ...> >> >> ------------------------------------------------------------------------------ >> October Webinars: Code for Performance >> Free Intel webinars can help you accelerate application performance. >> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from >> the latest Intel processors and coprocessors. See abstracts and register > >> http://pubads.g.doubleclick.net/gampad/clk?id=60135991&iu=/4140/ostg.clktrk >> _______________________________________________ >> VirtualGL-Users mailing list >> Vir...@li... 
>> https://lists.sourceforge.net/lists/listinfo/virtualgl-users |
From: Rafael G. <rg...@gm...> - 2013-10-28 18:30:47
|
Yes, I agree. I have looked at several other X11 apps and they all behave the same way. When the X server locks up, as happened here, vncpasswd (like xwd, for example) freezes, and the web portal that calls vncpasswd to generate one-time passwords keeps waiting for it indefinitely. For some reason, that caused Firefox as well as Chrome to become unresponsive, and I was not able to open any other TurboVNC session through the portal. I tested on several computers and saw the same behavior on all of them.

The only thing that solved this was returning an error when XOpenDisplay does not reply within some time (of course, I am talking here about unexpected behavior of XOpenDisplay). Now I can show the user an error message (unable to establish a connection to the X session). But I understand that this is a very particular solution for what is possibly a very particular problem.

Cheers,

Rafael

2013/10/28 DRC <dco...@us...> > I don't like introducing code that adds complexity and masks an issue. My > understanding from you is that, when this lock-up occurs, the X server > becomes generally unresponsive, so I don't understand why working around > the issue only in vncpasswd would make any difference. Other apps would > still not work. There is nothing wrong with the way vncpasswd is written. > It is written like every other X11 app on the planet. They all expect that > XOpenDisplay() will be well-behaved, and if it isn't, then that needs to be > addressed in the X server. > > On Oct 28, 2013, at 1:06 PM, Rafael Guimaraes <rg...@gm...> wrote: > > Sorry for not being specific from the beginning. I thought I had mentioned > it. Anyway, I will upgrade to the nightly release (1.2.1) and hope it > doesn't happen again. > However, don't you think it would be nice to have some kind of protection > in vncpasswd (like the patch I've implemented here and sent you) in order > to keep external bugs from causing vncpasswd to freeze (as happened > here)? 
I think it could improve vncpasswd fault-tolerance. > > Best regards, > > Rafael > > 2013/10/25 DRC <dco...@us...> > >> OK, well, why didn't you mention that before? Your original message >> implied that the lock-up either just occurred randomly or that it >> occurred as a result of running vncpasswd. Now you're saying that it >> occurs as a result of doing something specific in OpenWorks, which makes >> a lot more sense. In the process of working with Santos, we've >> uncovered a few issues like this in the past that were specific to oil & >> gas apps (those apps seem to be very good at finding bugs in X servers.) >> >> I am looking into the mouse button issue, but since that just affects >> the viewer, you should still be able to use the 1.2 server. If you need >> the 1.1 applet, then just do the following:
>>
>> -- Install the 1.1 RPM
>> -- mv /opt/TurboVNC/vnc/classes /tmp
>> -- Upgrade to the 1.2 RPM
>> -- mv /opt/TurboVNC/java /opt/TurboVNC/java.bak
>> -- mv /tmp/classes /opt/TurboVNC/java
>>
>> The 1.2 server should now serve up the 1.1 applet. >> >> I looked at the change log, and I don't see anything obvious that has >> changed in the server since 1.1 that might affect this, unless you're >> running into the 32-bit limitation of the idle timer that was fixed in >> 1.2 beta (but I doubt it-- the symptoms of that issue were different.) >> Regardless, though, a lot has changed in 1.2, so I really do recommend >> upgrading. Minimally, it ensures that we're on the same page. >> >> >> On 10/25/13 12:55 PM, Rafael Guimaraes wrote: >> > In fact, I think that it may be difficult for you to replicate the >> > problem. I am not able to do it either... It happened twice with the >> > same user and he couldn't explain to me what caused the locking. He was >> > using some software from Landmark (OpenWorks), something he does on a >> > daily basis, and the TurboVNC session closed. After that, Xvnc was >> > locked, as I reported. 
>> > There is also one issue that I thought I had mentioned, but I was >> > reviewing my emails and I hadn't: I am using TurboVNC 1.1, because of >> > the bug I previously reported (problems when 2 mouse buttons are pressed >> > simultaneously). >> > >> > Anyway, gdb told me the following:
>> >
>> > #0 0x000000371f6ce3b3 in __select_nocancel () from /lib64/libc.so.6
>> > #1 0x000000000045883f in CheckConnections () at connection.c:997
>> > #2 0x0000000000456ba7 in WaitForSomething (pClientsReady=0xed922c0) at WaitFor.c:356
>> > #3 0x0000000000442b19 in Dispatch () at dispatch.c:259
>> > #4 0x00000000004270fd in main (argc=24, argv=0x7fff92910dd8) at main.c:400
>> >
>> > It stays in this select even when I run vncpasswd. >> > >> > I will try to find out whether some specific function of OpenWorks is causing >> > Xvnc to hang so that I can have more information about the problem. >> > For the moment, since XOpenDisplay may hang, I have implemented a hack >> > in the vncpasswd code (v. 1.2, since I tried it with Xvnc 1.1) that I would >> > like to share, if you think that this may be useful in order to be more >> > fault tolerant (it waits up to 10 seconds for XOpenDisplay to return; >> > otherwise it exits). 
>> >
>> > --- vncpasswd.c 2012-09-30 21:06:15.000000000 -0300
>> > +++ /tmp/vncpasswd.c 2013-10-25 15:50:28.690931098 -0200
>> > @@ -37,7 +37,8 @@
>> >  #include <sys/types.h>
>> >  #include <unistd.h>
>> >  #include <errno.h>
>> >  #include "vncauth.h"
>> > +#include <signal.h>
>> >
>> >  #include <X11/Xlib.h>
>> >  #include <X11/Xatom.h>
>> > @@ -64,6 +65,20 @@
>> >  int otpClear;
>> >  char* displayname;
>> >
>> > +void displaytimeout(int);
>> > +int displayreply;
>> > +#define DISPLAYTIMEOUT 10
>> > +
>> > +
>> > +void displaytimeout(int signum)
>> > +{
>> > +  if (!displayreply) {
>> > +    fprintf(stderr, "unable to communicate to display \"%s\"\n",
>> > +            XDisplayName(displayname));
>> > +    exit(1);
>> > +  }
>> > +}
>> > +
>> >
>> >  int DoOTP()
>> >  {
>> > @@ -78,19 +93,21 @@
>> >    int fd;
>> >  #endif
>> >
>> > +  signal(SIGALRM, displaytimeout);
>> > +  alarm(DISPLAYTIMEOUT);
>> > +  displayreply=0;
>> >    if ((dpy = XOpenDisplay(displayname)) == NULL) {
>> >      fprintf(stderr, "unable to open display \"%s\"\n",
>> >              XDisplayName(displayname));
>> >      return(1);
>> >    }
>> > -
>> > +  displayreply=1;
>> >    prop = XInternAtom(dpy, "VNC_OTP", True);
>> >    if (prop == None) {
>> >      fprintf(stderr, "The X display \"%s\" does not support VNC one-time passwords\n",
>> >              XDisplayName(displayname));
>> >      return(1);
>> >    }
>> > -
>> >    if (otpClear) {
>> >      len = 0;
>> >
>> >
>> > Cheers, >> > >> > Rafael >> > >> > >> > >> > 2013/10/25 DRC <dco...@us... >> > <mailto:dco...@us...>> >> > >> > Really not sure what's going on here. I am running CentOS 5.9 as >> > well, and I launched Xvnc using the exact same command line that you >> > used below. I set up a script that generates 10,000 OTPs in rapid >> > succession, then I ran 10 copies of it in parallel, and all of them >> > ran to completion with no errors or lockups. I tried again using a >> > script that creates 1000 OTPs in rapid succession, and I ran 100 >> > copies of it. 
This was using the latest stable build of 1.2.x >> > available at http://virtualgl.sourceforge.net/vnc.nightly/, but >> > nothing has changed since 1.2 that would have affected this, so I >> > would expect identical behavior from 1.2. >> > >> > The fact that you have been running for years with no errors really >> > seems to indicate that something has changed at the system level. >> > We have not modified the way OTPs are generated, and in fact, I >> > actually don't think that feature has been touched since it was >> > first introduced in 1.0. >> > >> > One comment I will make is that you aren't setting up the VNC >> > password correctly below. You have to pass an argument of -rfbauth >> > to Xvnc in order to set it up with a "normal" VNC password. >> > >> > Rather than an strace output, it would be more useful to know where >> > in the Xvnc code the "infinite loop" is occurring. You should be >> > able to get that info with gdb. In the past, there have been >> > several cases in which I was able to diagnose an Xvnc error by >> > simply looking at the code, without even having to reproduce it. >> > >> > >> > On 10/25/13 6:46 AM, Rafael Guimaraes wrote: >> >> Replying what you asked: >> >> >> >> - I really didn't mention my OS. It is a CentOS 5.9 64 bits. It is >> >> pretty much updated. >> >> - Yes, when the X server locks up it remains locked. I have one >> >> Xvnc locked up right now and that's where I am performing all the >> >> tests. >> >> - Sockets permissions as well as its directories are ok, I have >> >> just checked it (they are all 777, with sticky bit for /tmp and >> >> /tmp/.X11-unix and setuid for the socket file) >> >> - When I run xwd on the same session, it also freezes. And it >> >> communicates with XVnc using a regular socket (through port 6013). 
>> >> This means that Xvnc is not responding the poll no matter how I >> >> communicate (through a local or a regular socket) and it also >> >> means that the problem is not restricted to generating OTP >> >> passwords. Based on that, I think that disabling local sockets >> >> won't help. >> >> - I have also been running a portal based on TurboVNC/VirtualGL >> >> for a couple of years and that's the first time that something >> >> similar happens. >> >> >> >> I don't know if this happens anyway, but I have runned strace on >> >> the problematic Xvnc process and it keeps doing just the same >> >> (looping over the same system calls shown below), even when I run >> >> a vncpasswd, while I can see a different behavior, i.e. normal >> >> processing, when I telnet to port 5913. >> >> >> >> select(128, [0 1 3 4 5 6 7 9 11 13 14 17 18 19 20 21 22 24 25 26 >> >> 27], NULL, NULL, {593, 19000}) = -1 EBADF (Bad file descriptor) >> >> select(6, [5], NULL, NULL, {0, 0}) = 0 (Timeout) >> >> select(7, [6], NULL, NULL, {0, 0}) = 0 (Timeout) >> >> select(8, [7], NULL, NULL, {0, 0}) = 1 (in [7], left {0, 0}) >> >> select(12, [11], NULL, NULL, {0, 0}) = 0 (Timeout) >> >> select(14, [13], NULL, NULL, {0, 0}) = 1 (in [13], left {0, 0}) >> >> select(15, [14], NULL, NULL, {0, 0}) = 0 (Timeout) >> >> select(18, [17], NULL, NULL, {0, 0}) = 0 (Timeout) >> >> select(19, [18], NULL, NULL, {0, 0}) = 1 (in [18], left {0, 0}) >> >> select(20, [19], NULL, NULL, {0, 0}) = 0 (Timeout) >> >> select(21, [20], NULL, NULL, {0, 0}) = 0 (Timeout) >> >> select(22, [21], NULL, NULL, {0, 0}) = 0 (Timeout) >> >> select(23, [22], NULL, NULL, {0, 0}) = 0 (Timeout) >> >> select(25, [24], NULL, NULL, {0, 0}) = 0 (Timeout) >> >> select(26, [25], NULL, NULL, {0, 0}) = 1 (in [25], left {0, 0}) >> >> select(27, [26], NULL, NULL, {0, 0}) = 0 (Timeout) >> >> select(28, [27], NULL, NULL, {0, 0}) = 0 (Timeout) >> >> select(13, [3], NULL, NULL, {0, 0}) = 0 (Timeout) >> >> select(5, [4], NULL, NULL, {0, 0}) = 0 (Timeout) 
>> >> >> >> - I also tried to set up a regular VNC password for the session (I >> >> did it by running "vncpasswd -d :13", is it correct? At that time, >> >> I monitored Xvnc with strace and it did nothing different from the >> >> selects shown above). Then I tried to open the VncViewer passing >> >> the password through command line. I've got the following output: >> >> >> >> Initializing... >> >> Connecting to localhost, port 5913... >> >> Connected to server >> >> Tentando conectar em localhost na porta 5555 >> >> RFB server supports protocol version 3.8 >> >> Using RFB protocol version 3.8 >> >> Enabling TightVNC protocol extensions >> >> Performing standard VNC authentication >> >> Error: The one-time password has not been set on the server >> >> java.lang.Exception: The one-time password has not been set on the >> >> server >> >> at RfbProto.readConnFailedReason(RfbProto.java:574) >> >> at RfbProto.readSecurityResult(RfbProto.java:556) >> >> at RfbProto.authenticateVNC(RfbProto.java:475) >> >> at VncViewer.connectAndAuthenticate(VncViewer.java:370) >> >> at VncViewer.run(VncViewer.java:188) >> >> at java.lang.Thread.run(Unknown Source) >> >> RFB socket closed >> >> Closing window >> >> Disconnecting >> >> >> >> - And the following strace for Xvnc >> >> >> >> accept(3, {sa_family=AF_INET, sin_port=htons(57854), >> >> sin_addr=inet_addr("127.0.0.1")}, [54348846876065808]) = 8 >> >> fcntl(8, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 >> >> setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0 >> >> write(2, "\n", 1) = 1 >> >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) >> = 0 >> >> write(2, "25/10/2013 09:38:41 ", 20) = 20 >> >> write(2, "Got connection from client 127.0"..., 37) = 37 >> >> getpeername(8, {sa_family=AF_INET, sin_port=htons(57854), >> >> sin_addr=inet_addr("127.0.0.1")}, [34359738384]) = 0 >> >> write(8, "RFB 003.008\n", 12) = 12 >> >> read(8, "RFB 003.008\n", 12) = 12 >> >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) 
>> = 0 >> >> write(2, "25/10/2013 09:38:44 ", 20) = 20 >> >> write(2, "Using protocol version 3.8\n", 27) = 27 >> >> write(8, "\2\2\20", 3) = 3 >> >> read(8, "\20", 1) = 1 >> >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) >> = 0 >> >> write(2, "25/10/2013 09:38:44 ", 20) = 20 >> >> write(2, "Enabling TightVNC protocol exten"..., 38) = 38 >> >> write(8, "\0\0\0\0", 4) = 4 >> >> write(8, "\0\0\0\1", 4) = 4 >> >> write(8, "\0\0\0\2STDVVNCAUTH_", 16) = 16 >> >> read(8, "\0\0\0\2", 4) = 4 >> >> write(8, "!\236\\U\252\30\25\356\265:\254\275$=\35s", 16) = 16 >> >> read(8, "J\221#\247]\376\274I\370C\232\314h\334;\263", 16) = 16 >> >> write(8, "\0\0\0\1\0\0\0004", 8) = 8 >> >> write(8, "The one-time password has not be"..., 52) = 52 >> >> close(8) = 0 >> >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) >> = 0 >> >> write(2, "25/10/2013 09:38:44 ", 20) = 20 >> >> write(2, "Client 127.0.0.1 gone\n", 22) = 22 >> >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) >> = 0 >> >> write(2, "25/10/2013 09:38:44 ", 20) = 20 >> >> write(2, "Statistics:\n", 12) = 12 >> >> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2001, ...}) >> = 0 >> >> write(2, "25/10/2013 09:38:44 ", 20) = 20 >> >> write(2, " framebuffer updates 0, rectang"..., 47) = 47 >> >> >> >> Any ideas? >> >> >> >> Best regards, >> >> >> >> Rafael >> >> >> >> >> >> 2013/10/24 DRC <dco...@us... >> >> <mailto:dco...@us...>> >> >> >> >> Since Xvnc is single-threaded, if you are able to telnet in on >> >> 59xx, >> >> that means that the server is not locked up. That leaves the >> >> following >> >> as possibilities in my mind: >> >> >> >> -- Perhaps this is due to a problem in the X11 client library. >> >> If so, >> >> then there's probably nothing I can do about it. Try updating >> >> your >> >> system to the latest O/S patches? >> >> >> >> -- Double-check permissions on /tmp, /tmp/.X11-unix, etc. 
>> >> They should >> >> be writable by whoever used is launching the VNC server. >> >> >> >> -- If all else fails, then perhaps you can work around it by >> >> disabling >> >> local X11 communications. You do this by starting Xvnc with >> >> '-nolisten >> >> local', which forces it to listen for X11 traffic on a TCP >> >> socket rather >> >> than a local socket, so it will not use /tmp/.X11-unix/X* in >> >> that case. >> >> >> >> Several things that I still don't understand about this >> situation: >> >> >> >> -- If you mentioned what O/S these machines are running, I >> >> missed it, so >> >> please provide that information. If it's an O/S I haven't >> >> tested, then >> >> I'm happy to try reproducing it on my end, but you implied >> >> that there >> >> was nothing different about the machines that were >> >> experiencing the bug >> >> vs. the machines that aren't (?) >> >> >> >> -- When the X server "locks up", does it remain locked? That >> >> is, do >> >> repeated attempts to run vncpasswd fail in the same way? >> >> >> >> -- Have you tried to assign a regular VNC password to the >> >> session? That >> >> would give you a way to log in and verify whether it is still >> >> properly >> >> accepting client connections once vncpasswd starts failing. >> >> >> >> I'm not saying that there's no possibility of a bug in >> >> TurboVNC, but we >> >> do have thousands of seats worldwide that are using portals >> >> built around >> >> the OTP functionality. These portals are dynamically >> >> generating OTPs >> >> every time a user launches or reconnects to a session, so it >> >> seems to me >> >> that if this was a bug in TurboVNC, someone else would have >> >> stumbled >> >> upon it by now. >> >> >> >> >> >> On 10/24/13 9:26 AM, Rafael Guimaraes wrote: >> >> > Hi DRC, >> >> > >> >> > Just some last information... 
I have tried to telnet Xvnc >> >> (in ports 5813 >> >> > and 5913, since I using display 13) in order to see if the >> >> Xvnc process >> >> > is completely locked or not and it seems to be responding to >> my >> >> > connections correclty. >> >> > >> >> > I have opened a telnet to port 5813, sent a "GET >> >> /VncViewer.jar" and was >> >> > able to download the applet. Then I opened a telnet to port >> >> 5913, >> >> > received "RFB 003.008", send "RFB 003.003" and got some >> >> binary info I >> >> > was not able to figure out... But it seems to be responding >> >> correctly, >> >> > it is not completely locked. I don't know what further tests >> >> should I do >> >> > for finding out what's happening. Since it has already >> >> happened twice, >> >> > in two different machines, I would like to know as much as I >> >> possible in >> >> > order to avoid it from happening again in the future... >> >> > >> >> > Cheers, >> >> > >> >> > Rafael >> >> > >> >> > >> >> > 2013/10/24 Rafael Guimaraes <rg...@gm... >> >> <mailto:rg...@gm...> <mailto:rg...@gm... >> >> <mailto:rg...@gm...>>> >> >> > >> >> > Hi DRC, >> >> > >> >> > I overcame my laziness and launched vncpasswd with gdb. >> >> The result >> >> > was pretty much the same of what strace and printfs have >> >> shown. The >> >> > problem really seems to be in the communication with the >> >> Xvnc server... >> >> > >> >> > Cheers, >> >> > >> >> > Rafael >> >> > >> >> > >> >> > >> >> > Reading symbols from >> >> /opt/TurboVNC/bin/vncpasswd...Reading symbols >> >> > from >> /usr/lib/debug/opt/TurboVNC/bin/vncpasswd.debug...done. >> >> > done. >> >> > (gdb) r -o -display :13 >> >> > Starting program: /opt/TurboVNC/bin/vncpasswd -o >> >> -display :13 >> >> > >> >> > Program received signal SIGINT, Interrupt. >> >> > 0x000000371f6cc30f in poll () from /lib64/libc.so.6 >> >> > (gdb) bt >> >> > #0 0x000000371f6cc30f in poll () from /lib64/libc.so.6 >> >> > #1 0x000000372124a9f0 in ?? 
() from >> /usr/lib64/libX11.so.6 >> >> > #2 0x000000372124ae19 in _XRead () from >> >> /usr/lib64/libX11.so.6 >> >> > #3 0x00000037212378c9 in XOpenDisplay () from >> >> /usr/lib64/libX11.so.6 >> >> > #4 0x000000000040187c in DoOTP () at vncpasswd.c:81 >> >> > #5 0x0000000000402187 in main (argc=4, >> >> argv=0x7fffffffe748) at >> >> > vncpasswd.c:293 >> >> > (gdb) >> >> > >> >> > >> >> > >> >> > 2013/10/24 Rafael Guimaraes <rg...@gm... >> >> <mailto:rg...@gm...> <mailto:rg...@gm... >> >> <mailto:rg...@gm...>>> >> >> > >> >> > Hi DRC, >> >> > >> >> > I have followed the best debugging practice there is >> >> (don't >> >> > worry, I'm being ironic :)), putting printfs around >> >> the possible >> >> > problem and I figured out that the vncpasswd hangs >> >> when calling >> >> > XOpenDisplay. When executing the code below, it >> >> prints "Opening >> >> > display :13" and nothing else. If I run it with >> >> strace, I get >> >> > the trace below. >> >> > >> >> > I didn't do the gdb test just because I am not very >> >> familiar >> >> > with doing it by gdb (I've already done it before, >> >> but there has >> >> > been centuries). If the following information is not >> >> enough and >> >> > gdb would help, I am sure I can handle it, no >> >> problem. I am just >> >> > trying to make thing easier for me! A little bit >> >> egocentric, I >> >> > know, but I just can't help it... :) >> >> > >> >> > If you check the strace output, it hangs on the last >> >> line, when >> >> > I seems to be trying to communicate through a socket >> >> > "/tmp/.X11-unix/X13". I imagine it is trying to >> >> reach Xvnc on >> >> > display 13 and it is getting no response, so it >> >> keeps waiting >> >> > forever... Am I right? What should I do if this is >> >> the case? 
>> >> > >> >> > Cheers, >> >> > >> >> > Rafael >> >> > >> >> > VNCPASSWD CODE WITH PRINTFS: >> >> > >> >> > int DoOTP() >> >> > { >> >> > unsigned int full; >> >> > unsigned int view = 0; >> >> > Display* dpy; >> >> > Atom prop; >> >> > int len; >> >> > char buf[MAXPWLEN + 1]; >> >> > char bytes[MAXPWLEN * 2]; >> >> > #ifdef UseDevUrandom >> >> > int fd; >> >> > #endif >> >> > >> >> > *printf("Opening display %s\n",displayname);* >> >> > if ((dpy = XOpenDisplay(displayname)) == NULL) { >> >> > fprintf(stderr, "unable to open display >> \"%s\"\n", >> >> > XDisplayName(displayname)); >> >> > return(1); >> >> > } >> >> > *printf("Display opened\n");* >> >> > >> >> > >> >> > STRACE: >> >> > >> >> > write(1, "Opening display :13\n", 20Opening display >> :13 >> >> > ) = 20 >> >> > brk(0) = 0x59f3000 >> >> > brk(0x5a14000) = 0x5a14000 >> >> > uname({sys="Linux", node="server03", ...}) = 0 >> >> > socket(PF_FILE, SOCK_STREAM, 0) = 3 >> >> > uname({sys="Linux", node="server03", ...}) = 0 >> >> > uname({sys="Linux", node="server03", ...}) = 0 >> >> > connect(3, {sa_family=AF_FILE, >> >> path="/tmp/.X11-unix/X13"...}, >> >> > 20) = 0 >> >> > uname({sys="Linux", node="server03", ...}) = 0 >> >> > fcntl(3, F_SETFD, FD_CLOEXEC) = 0 >> >> > access("/u/cnhu/.Xauthority", R_OK) = 0 >> >> > open("/u/cnhu/.Xauthority", O_RDONLY) = 4 >> >> > fstat(4, {st_mode=S_IFREG|0600, st_size=13139, ...}) >> = 0 >> >> > mmap(NULL, 32768, PROT_READ|PROT_WRITE, >> >> > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b34aa97f000 >> >> > read(4, >> >> "\1\0\0\10mi080113\0\00210\0\22MIT-MAGIC-COOK"..., >> >> > 32768) = 13139 >> >> > read(4, "", 32768) = 0 >> >> > close(4) = 0 >> >> > munmap(0x2b34aa97f000, 32768) = 0 >> >> > writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, >> >> > {"MIT-MAGIC-COOKIE-1", 18}, {"\0\0", 2}, >> >> > {"e\30\226A>\316@\17Y8\365+V\21\365\f", 16}], 4) = 48 >> >> > fcntl(3, F_GETFL) = 0x2 (flags O_RDWR) >> >> > fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 >> >> > read(3, 0x7fff865304b0, 8) = 
-1 EAGAIN (Resource >> >> > temporarily unavailable) >> >> > poll([{fd=3, events=POLLIN}], 1, -1 <unfinished ...> >> >> > >> >> > >> >> > >> >> > >> >> > 2013/10/24 DRC <dco...@us... >> >> <mailto:dco...@us...> >> >> > <mailto:dco...@us... >> >> <mailto:dco...@us...>>> >> >> > >> >> > Can you figure out where in the vncpasswd code >> >> it's locking >> >> > up? That >> >> > would help diagnose the issue. If you're using >> >> an RPM-based >> >> > system, you >> >> > can install the turbovnc-debuginfo RPM and get a >> >> stack trace >> >> > from gdb >> >> > when it locks up. Otherwise, you'll have to >> >> build vncpasswd >> >> > from >> >> > source, but it's straightforward. >> >> > >> >> > This is a complete shot in the dark, but I'm >> >> wondering if >> >> > maybe the >> >> > /dev/urandom device is somehow giving you >> >> problems on some >> >> > of your >> >> > machines. vncpasswd will read from /dev/urandom >> >> when it >> >> > generates an >> >> > OTP. If that is the problem, then adding >> >> > >> >> > #undef UseDevUrandom >> >> > >> >> > to the top of vncpasswd.c should temporarily >> >> work around it, >> >> > and it >> >> > would be easy to add a more permanent workaround >> >> (a command >> >> > line switch >> >> > that avoids using /dev/urandom.) >> >> > >> >> > >> >> > On 10/23/13 1:36 PM, Rafael Guimaraes wrote: >> >> > > Hi, >> >> > > >> >> > > I have created a VNC session and after using >> >> it for sometime, it seems >> >> > > to be "locked". It is configured to use >> >> one-time-passwords and I just >> >> > > can't generate a new otp for access the >> >> session, vncpasswd hangs >> >> > > indefinitely when trying to get a new password. 
>> >> > > >> >> > > The session was created by the following >> >> command line: >> >> > > /opt/TurboVNC/bin/Xvnc :13 -desktop X -httpd >> >> > > /opt/TurboVNC/bin/../vnc/classes -auth >> >> /u/cnhu/.Xauthority >> >> > > -dontdisconnect -geometry 3192x1046 -depth 24 >> >> -rfbwait 120000 -otpauth >> >> > > -rfbport 5913 -fp >> >> > > >> >> >> /usr/share/X11/fonts/misc,/usr/share/X11/fonts/75dpi,/usr/share/X11/fonts/100dpi,/usr/share/X11/fonts/Type1,/usr/share/fonts/default/Type1 >> >> > > -co /usr/share/X11/rgb -deferupdate 1 >> >> > > >> >> > > When trying to launch a "vncpasswd -otp >> >> -display :13", I get no response >> >> > > and eventually I have to press Ctrl+C. I have >> >> runned the vncpasswd >> >> > > command line with strace and it seems to hang >> >> while trying to connect to >> >> > > Xvnc through the "/tmp/.X11-unix/X13" socket >> >> (not sure if I've got it >> >> > > right). Below you may check the end of the >> >> strace (until it hangs). >> >> > > >> >> > > What may the problem be? The same happened >> >> twice in two different >> >> > > machines. I have several other VNC sessions >> >> where this does not happen. >> >> > > Can I collect information from anywhere else >> >> that could help me trying >> >> > > to figure out what happened? 
>> >> > > >> >> > > uname({sys="Linux", node="server03", ...}) = 0 >> >> > > socket(PF_FILE, SOCK_STREAM, 0) = 3 >> >> > > uname({sys="Linux", node="server03", ...}) = 0 >> >> > > uname({sys="Linux", node="server03", ...}) = 0 >> >> > > connect(3, {sa_family=AF_FILE, >> >> path="/tmp/.X11-unix/X13"...}, 20) = 0 >> >> > > uname({sys="Linux", node="server03", ...}) = 0 >> >> > > fcntl(3, F_SETFD, FD_CLOEXEC) = 0 >> >> > > access("/u/cnhu/.Xauthority", R_OK) = 0 >> >> > > open("/u/cnhu/.Xauthority", O_RDONLY) = 4 >> >> > > fstat(4, {st_mode=S_IFREG|0600, st_size=13139, >> >> ...}) = 0 >> >> > > mmap(NULL, 32768, PROT_READ|PROT_WRITE, >> >> MAP_PRIVATE|MAP_ANONYMOUS, -1, >> >> > > 0) = 0x2ab9583b5000 >> >> > > read(4, >> >> "\1\0\0\10mi080113\0\00210\0\22MIT-MAGIC-COOK"..., 32768) = >> 13139 >> >> > > read(4, "", 32768) = 0 >> >> > > close(4) = 0 >> >> > > munmap(0x2ab9583b5000, 32768) = 0 >> >> > > writev(3, [{"l\0\v\0\0\0\22\0\20\0\0\0", 12}, >> >> {"MIT-MAGIC-COOKIE-1", >> >> > > 18}, {"\0\0", 2}, >> >> {"e\30\226A>\316@\17Y8\365+V\21\365\f", 16}], 4) = 48 >> >> > > fcntl(3, F_GETFL) = 0x2 (flags >> O_RDWR) >> >> > > fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 >> >> > > read(3, 0x7fff0b98aab0, 8) = -1 >> >> EAGAIN (Resource >> >> > > temporarily unavailable) >> >> > > poll([{fd=3, events=POLLIN}], 1, -1 >> >> <unfinished ...> >> >> >> ------------------------------------------------------------------------------ >> October Webinars: Code for Performance >> Free Intel webinars can help you accelerate application performance. >> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most >> from >> the latest Intel processors and coprocessors. See abstracts and register > >> >> http://pubads.g.doubleclick.net/gampad/clk?id=60135991&iu=/4140/ostg.clktrk >> _______________________________________________ >> VirtualGL-Users mailing list >> Vir...@li... 
>> https://lists.sourceforge.net/lists/listinfo/virtualgl-users |
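For the portal side of the problem discussed above, one option that avoids patching vncpasswd at all is to bound how long the caller waits for it. The following is only a sketch under stated assumptions, not anything that ships with TurboVNC: `run_with_timeout` is a hypothetical helper name, and the 100 ms polling interval is an arbitrary choice. It forks, execs the command, polls `waitpid()` with `WNOHANG`, and sends `SIGKILL` if the child has not exited within the limit, so a child stuck inside XOpenDisplay() cannot block the caller forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not part of TurboVNC): run argv[0] with its
 * arguments and wait at most timeout_sec seconds for it to finish.
 * Returns the child's exit status (0-255), -1 on a fork/wait error or
 * if the child was terminated by a signal, and -2 if the child had to
 * be killed because the timeout expired. */
int run_with_timeout(char *const argv[], int timeout_sec)
{
  assert(argv != NULL && argv[0] != NULL);

  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0) {
    execvp(argv[0], argv);
    _exit(127);                     /* exec failed */
  }

  const struct timespec tick = { 0, 100 * 1000 * 1000 };  /* 100 ms */
  int waited_ms = 0;

  for (;;) {
    int status;
    pid_t r = waitpid(pid, &status, WNOHANG);

    if (r == pid)                   /* child has finished */
      return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    if (r < 0 && errno != EINTR)
      return -1;

    if (waited_ms >= timeout_sec * 1000) {
      kill(pid, SIGKILL);           /* child is still running: give up */
      waitpid(pid, &status, 0);     /* reap it so no zombie is left */
      return -2;
    }

    nanosleep(&tick, NULL);
    waited_ms += 100;
  }
}
```

A caller could pass `{"/opt/TurboVNC/bin/vncpasswd", "-o", "-display", ":13", NULL}` with a 10-second limit and show the "unable to establish a connection to the X session" message when the result is -2. This does not address the unresponsive X server itself; it only keeps the caller from waiting indefinitely.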