From: cppjavaperl <cpp...@ya...> - 2008-01-25 06:20:32
I have a server running BackupPC 2.1.0pl1. I was having problems with a particular Windows XP client that had been backing up fine in the past. I had some problems with the Cygwin installation on it, so I re-installed Cygwin (moving the old version out of the way, so I would be sure to get a clean install). Since re-installing Cygwin, and removing and reinstalling the rsync daemon, I had not been able to get a backup to complete. The process would stop backing up files after a few hours, but it did not exit, and the backup never completed. It would just show up in "Currently Running Jobs" until I killed it. I had my ClientTimeout set to 72000 (seconds, i.e. 20 hours) and I'm sure I always killed it before that much time had passed. :-) I am using the latest version of rsync (from the BackupPC download area on SourceForge).

I decided to run a manual rsync on my BackupPC server and see what happened. At the time I was under the impression that the problem was with something in my C:\Windows\system32 folder, so I tried to sync just that folder. I was surprised to find that the following command failed:

    rsync -avvvv backup@mymachine::C/windows/system32 .

Here is the error I got:

    rsync: pop_dir "/cygdrive/c/WINDOWS/system32/c:" (in C) failed: No such file or directory (2)

I found some information on this problem in this thread on devshed.com:

http://archives.devshed.com/forums/networking-100/no-such-file-or-directory-but-hey-the-files-are-there-2113763.html

(Quoting Matt McCutchen):

    I see what is going on here. In order to scan under a source argument for files to add to the file list, the sender needs to be in the directory that maps to the root of the transfer. In the first command, the root of the transfer is the module, and the sender is already there. But in the second, the root of the transfer is the "backup" subdirectory, so the sender needs to push_dir into "backup", scan, and pop_dir back to the module.
    As the sender is starting up, it pushes into "u:" by calling chdir("u:"). Cygwin understands "u:" as an absolute path for the U drive, so this works fine. But the sender assumes "u:" is relative because it doesn't begin with a slash, so the sender thinks it has entered "/cygdrive/c/WINDWS/system32/u:". When the sender tries to pop back into this directory later, the error occurs.

I was kind of surprised at this, since I have several Windows boxes running an rsync daemon, and I have always configured the module for the C: drive with a path of "C:" (which is the example given in the BackupPC rsync package on SourceForge). So, I changed the module path to /cygdrive/c, restarted rsyncd, and retested the manual rsync -avvvv ... command I had tried earlier.

This worked fine, so I decided to run "BackupPC_dump -v -i mymachine" manually so I could see what was happening with the backup. Things went along swimmingly for a few hours, and then suddenly BackupPC_dump segfaulted! No error messages, no nothing, just SEGMENTATION FAULT.

I searched the mailing list, but couldn't find a match for the problem. The only post that came close was one where Craig had said you could get a segfault if the client sent two files with the exact same path and name. I didn't see how that could be the case, so I kept looking.

By looking through the XferLOG and the directories of the prematurely aborted backup, I found that the backup was aborting in the directory which contained my original Cygwin install (I had renamed the directory to move it out of the way, so that I could grab files from it later).

I had made multiple attempts to recover my backed-up Cygwin configuration. Presently I don't have broadband available, so it wasn't as simple as re-downloading all of the Cygwin packages I wanted. I had a repository of packages grabbed mostly from the same mirror, but there were a few packages from other mirrors, along with some I had previously burned to CD from work.
I think the real problem, however, stemmed from me trying to restore my Cygwin directory from my BackupPC server. I *did* have problems installing Cygwin again if I picked anything other than a "Default" install. So, after getting a successful (no errors) base install of Cygwin, I tried to restore from a BackupPC-generated tar file over the top of the freshly installed Cygwin. That didn't work, so I gave up, moved this test Cygwin install directory out of the way, re-installed Cygwin with the "Default" install and moved on.

This post is already far too long, so I'll get to the point. Craig was right. :-)

The problem was that I had programs in my c:/cygwin directory that existed as *both* regular empty files and as Windows shortcuts (.lnk files). The first I found was /bin/aclocal, then /bin/autoconf and /bin/automake. For each of these there was a zero-byte file with the name you would expect (like /bin/autoconf -- no extension). *But* there was also a c:/cygwin/bin/autoconf.lnk which pointed to another file. Under rsync, both were transmitted with the filename /cygdrive/c/cygwin/bin/autoconf. That was causing the segfault.

To confirm this, I changed my rsyncd.conf file to point to the old cygwin/bin directory which had the "duplicate" files in it, restarted rsyncd, and re-ran my BackupPC_dump command. *BINGO* -- almost immediately it segfaulted on the first of those files it found.

To fix this problem, I wrote a small perl program to search directories for files which occurred both with and without a .lnk extension. The search directories are given as command line arguments to the program. I ran this program under ActiveState perl, because Cygwin's perl would interpret the filenames and not give me what I wanted.
Here is the program:

    use strict;
    use warnings;
    use File::Find;

    # Walk every directory named on the command line.
    find(\&checkForLnkFile, @ARGV);

    sub checkForLnkFile {
        return if ( -d $_ );                         # skip directories
        return if ( lc(substr($_, -4)) eq '.lnk' );  # skip the shortcuts themselves
        # Report a file that also has a same-named .lnk sibling.
        print $File::Find::dir."/$_\n" if ( -f $_.'.lnk' );
    }

This program printed out a list of the offending files (without the .lnk extension). To test, I removed all of the .lnk files from the directory I had last run BackupPC_dump against, and reran BackupPC_dump. *BINGO* again! This time the backup completed with no problems. So, I removed the rest of the duplicate files, reset the rsyncd.conf to back up the whole drive again, and am presently running BackupPC_dump again (which I fully expect to complete without a hitch).

After spending all this time figuring this out, I decided to share it here in case someone else is having the same problem. Also, I was wondering whether it would be a good idea for BackupPC to guard against this problem with some kind of error. I know my server is running an older version, so perhaps this has been addressed in one of the 3.x versions. But I didn't find anything about it on the mailing list (other than Craig's comment that led me to the solution), so I suspect that even the 3.x versions are susceptible to this.

Hope this helps someone out,
cppjavaperl
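
For illustration only, here is a minimal sketch of the kind of check suggested above: given the path names as a client would send them, report any name that appears more than once. The function name find_duplicate_paths and the sample file list are hypothetical; this is not BackupPC or rsync code, and it assumes the duplicate names arrive as identical strings, as described in this post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BackupPC): return every path that
# occurs more than once in the list, in the order the repeats appear.
sub find_duplicate_paths {
    my @paths = @_;
    my %seen;
    my @dups;
    for my $path (@paths) {
        # Record a path once, at the moment its second copy is seen.
        push @dups, $path if ++$seen{$path} == 2;
    }
    return @dups;
}

# Sample list (made up for this example): the empty file "autoconf" and
# the shortcut "autoconf.lnk" both arrive under the same name.
my @file_list = (
    '/cygdrive/c/cygwin/bin/aclocal',
    '/cygdrive/c/cygwin/bin/autoconf',
    '/cygdrive/c/cygwin/bin/autoconf',
    '/cygdrive/c/cygwin/bin/automake',
);

print "duplicate: $_\n" for find_duplicate_paths(@file_list);
```

A check along these lines would turn the silent segfault into a readable message naming the file to clean up, which is the same information the search program above produces on the client side.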