From: Les M. <les...@gm...> - 2012-11-15 18:20:25
On Thu, Nov 15, 2012 at 11:22 AM, Markus <uni...@tr...> wrote:
>
> The client is a quad core 2.8 GHz CPU, 8 GB RAM and 1.6 TB of many many
> small files in a RAID0. CPUs 75-95% idle most of the time, load around
> 0.3. No swap used.
>
> rsync 3.0.7 on the client, rsync 3.0.6 on the server. Negotiated
> protocol 28 (hmm).

BackupPC uses its own rsync protocol implementation on the server side
so it can work with the compressed/pooled archive, and protocol 28 is
the latest it knows. This is unfortunate in the 'huge number of files'
case because it has to transfer the entire file list before starting
the comparisons.

> BackupPC server is backing up to a 8 TB iSCSI RAID5 drive.

RAID5 has performance issues with writes, but you haven't gotten that
far yet.

> So, when I initiate the full backup manually, and then attach with
> strace to the rsync process on the BackupPC server I see activity, but
> it's so slow. Roughly it takes about 5 seconds for 10 lines of
> read/select/write. On the client I can see via lsof that rsync is going
> through the filesystem and the many mini-files. On the client rsync
> consumes like 0.x-1% CPU and the "virt" mem size is growing slowly (80M
> after 10 minutes, in the beginning 40M; that's what "top" says).
>
> Load on the client goes from 0.3 to about 1.8 a few minutes after I
> start the full backup. But CPUs stay at around 70-90% idle.

That's mostly normal - the client has to walk the entire directory tree
and send the resulting file list to the server. But it should happen at
a reasonable speed.

> I'm guessing the client is slowing down after so many hours because
> rsync has used up all memory as it is caching the list of files to be
> transferred?

Yes, both the client and the server will hold the whole file list in
memory.

> What I don't understand yet is - why does rsync on the client tell rsync
> on the server about the files it is currently going through? I mean, it
> is not even transferring them yet!

The server gets the client's directory list, then walks it comparing to
the existing data, telling the client to send any differences.

> I see these select/read/write outputs
> in strace attached to the rsync server process, and as it appears this
> means "Hello, I'm the client, and I'm telling you which files I go
> through now on the client filesystem". But why does the rsync server
> process even need to know about that at this point in time?
>
> I'm guessing it can't be the comparing mechanism of BackupPC, because
> there are no files to compare yet! There are 0 backups with 0 files.

That's just the way the rsync protocol works - or did up until
protocol 30.

> Any suggestions on what I could do or what could go wrong here?

You are probably pushing the client or server into swap, which will slow
things down horribly. If there are top-level directories segregating
the files sensibly, you could split the backup into multiple 'shares'.
Otherwise, you could switch the xfer method to tar. Also, I would try
something like 'time find / | wc -l' on the target system just to see
how long it takes to walk the directory tree and how many files are
there.

-- 
Les Mikesell
les...@gm...
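
As a rough companion to the 'time find / | wc -l' check above, the sketch below counts entries under each top-level directory, which may help judge whether splitting into several shares is practical. The function name count_per_dir and the /data path in the usage comment are made up for illustration, and the -xdev option is assumed to be available (it is in GNU and BSD find).

```shell
#!/bin/sh
# Hypothetical helper: print "<entry count><TAB><directory>" for every
# top-level directory under a root, largest first. Each count includes
# the directory itself. -xdev stops find from crossing into other
# mounted filesystems.
count_per_dir() {
    root=${1:-/}
    for d in "${root%/}"/*/; do
        # Skip the unexpanded pattern when the root has no subdirectories.
        [ -d "$d" ] || continue
        n=$(find "$d" -xdev 2>/dev/null | wc -l)
        # $((n)) strips the padding some wc implementations add.
        printf '%s\t%s\n' "$((n))" "$d"
    done | sort -rn
}

# Example use (path is an assumption, adjust to the real data root):
#   count_per_dir /data
```

Directories that dominate the count are the natural candidates for their own share; if one directory holds nearly everything, splitting by top-level directory will not help much and the tar method is probably the simpler route.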