From: <cba...@us...> - 2003-10-27 17:53:47

Rick Clark writes:

>   # dd if=/dev/hda6 bs=1024 count=4100000 | bzip2 > hda6.bz2
>
> appears to work, creating a file about half a gig in size (about 2 gig of
> the FS had been in use). But being the paranoid sort, I tried to uncompress
> it into another partition.
>
>   # bunzip2 < hda6.bz2 | dd bs=1024 of=/dev/hdb8
>
> Chugs for a while, then fails with a complaint from bunzip2 about a CRC
> error.
>
>   # bzip2 -tvv < hda6.bz2
>
> reveals the compression block it fails at, quite a ways into the file.
> I've repeated the compression test several times with the same source
> file, and the decompression failure always occurs at a different
> compression block.
>
> I tried gzip once, with similar results!

I'm very surprised by this, and naturally a little skeptical. Perhaps you
have some flaky hardware?

Are the file systems you are using unmounted when you do the test?

What happens when you do this instead:

  # dd if=/dev/hda6 bs=1024 count=4100000 > hda6.raw
  # bzip2 < hda6.raw | bzip2 -tvv

I'm not near a Linux machine (in fact I'm about 6,000 miles away from
mine), but I can try this too when I get home.

Craig
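
One further check worth making before blaming the compressor (not something
Craig asked for; just a minimal sketch reusing the same device and block
count): read the identical region twice and compare checksums. If two reads
of an unmounted partition do not produce the same bytes, nothing downstream
of dd can be trusted.

  # dd if=/dev/hda6 bs=1024 count=4100000 2>/dev/null | md5sum
  # dd if=/dev/hda6 bs=1024 count=4100000 2>/dev/null | md5sum

Matching sums on every run make a simple read-path fault unlikely; differing
sums point at the disk, controller, RAM, or cache rather than at bzip2. (The
2>/dev/null only hides dd's records-in/out summary so that just the checksum
prints.)
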
From: Rick C. <rnp...@ch...> - 2003-10-28 03:29:15

Craig -

The partition is not mounted. Hasn't been since using it to create the new
LVM partitions. I have tried this with the intermediate file on both hard
drives, so if it is flaky hardware, it is at the motherboard, memory, or
processor level.

I didn't think I could do the test you specify, because that would make a
file of nearly 4 gig, which I thought was not supported under ext3. Turns
out it is. Things got _really_ strange.

>What happens when you do this instead:
>
>  # dd if=/dev/hda6 bs=1024 count=4100000 > hda6.raw
>  # bzip2 < hda6.raw | bzip2 -tvv

What happens is it failed at the very first compression block. Three times
in a row. BUT I then used dd to copy off the first 20 megabytes of the big
file, and the test passed on that small file just fine. Then I went back and
tested the big file again, and guess what? It is chugging away merrily past
compression block #1,296.

So, either I've lost my mind, I've become stone cold stupid, or this
particular machine or kernel has a problem with its file cache algorithm.
Again, that is a Slackware 9.0 distribution of kernel 2.4.20 with some ext3
patches, LVM utilities version 1.0.6 (IOP 10), compiled for Athlon.

Lovely. Now what do I do? Is there a newer stable kernel I should upgrade
to? Fortunately, I'm guessing my customer does not have this problem on his
older Red Hat.

- Rick Clark (Open to suggestions)...
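
The exact commands behind the 20-megabyte subset test aren't in the message;
a sketch of that kind of check, assuming the intermediate file is named
hda6.raw as in Craig's suggestion and that dd is GNU dd (which accepts the
1M block-size suffix), would be:

  # dd if=hda6.raw of=hda6.head bs=1M count=20
  # bzip2 < hda6.head | bzip2 -tvv

If a small slice always passes while the full file fails only intermittently,
the failures are more likely tied to how much data moves through memory than
to any particular bytes on disk.
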
From: Daniel P. <da...@ri...> - 2003-10-28 03:49:17

On Mon, 27 Oct 2003, Rick Clark wrote:

[...]

> I didn't think I could do the test you specify, because that would make a
> file of nearly 4 gig, which I thought was not supported under ext3. Turns
> out it is.

Sure it does. Has forever, as you found. The *tools* might have problems,
though, if they don't have large file support built in.

> Things got _really_ strange.
>
>> What happens when you do this instead:
>>
>>   # dd if=/dev/hda6 bs=1024 count=4100000 > hda6.raw
>>   # bzip2 < hda6.raw | bzip2 -tvv
>
> What happens is it failed at the very first compression block. Three times
> in a row. BUT I then used dd to copy off the first 20 megabytes of the big
> file, and the test passed on that small file just fine. Then I went back
> and tested the big file again, and guess what? It is chugging away merrily
> past compression block #1,296.

Had you finished creating the image there, or started the test while it was
still being written, by chance?

> So, either I've lost my mind, I've become stone cold stupid, or this
> particular machine or kernel has a problem with its file cache algorithm.

...or your 'dd' or shell are wrapping at the 2 or 4 GB mark as a result of
some nasty largefile support bug. :)

Daniel

--
We live in a hallucination of our own devising.
        -- Alan Kay
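
For reference on the large-file-support point (none of this is from the
thread): on 32-bit Linux a program can only handle files past 2 GB if it was
built with large-file support, which glibc turns on via the compile flags
that getconf reports, and whether an installed binary actually uses it can
be spot-checked by watching the flags it passes to open(). A minimal sketch,
assuming gcc, getconf, and strace are installed, with copytest.c standing in
for whatever source file is being built:

  # getconf LFS_CFLAGS
  # gcc $(getconf LFS_CFLAGS) -o copytest copytest.c
  # strace -e trace=open bzip2 -tvv hda6.bz2 2>&1 | grep LARGEFILE

The first command usually prints -D_FILE_OFFSET_BITS=64; the last one shows
whether bzip2 opens hda6.bz2 with O_LARGEFILE. A tool fed purely through a
pipe never has to seek, but the shell doing a < hda6.raw redirection still
has to open the 4 GB file itself, which is presumably why Daniel points at
dd and the shell as well as the compressor.
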
From: Rick C. <rnp...@ch...> - 2003-10-28 04:43:18

>On Mon, 27 Oct 2003, Rick Clark wrote:
>> Things got _really_ strange.
>>>
>>>What happens when you do this instead:
>>>
>>>  # dd if=/dev/hda6 bs=1024 count=4100000 > hda6.raw
>>>  # bzip2 < hda6.raw | bzip2 -tvv
>>
>> What happens is it failed at the very first compression block. Three times
>> in a row. BUT I then used dd to copy off the first 20 megabytes of the big
>> file, and the test passed on that small file just fine. Then I went back
>> and tested the big file again, and guess what? It is chugging away merrily
>> past compression block #1,296.

At 02:46 PM 10/28/03 +1100, Daniel Pittman wrote:
>Had you finished creating the image there, or started the test while it was
>still being written, by chance?

The image was finished. At least, the prompt came back. Sync may not have
completed, but surely it had by my third try at the compress-to-decompress
test through the pipe.

>> So, either I've lost my mind, I've become stone cold stupid, or this
>> particular machine or kernel has a problem with its file cache
>> algorithm.
>
>...or your 'dd' or shell are wrapping at the 2 or 4 GB mark as a result
>of some nasty largefile support bug. :)

The dd was only involved in the initial creation of the ~4 gig file. The
strangeness happened with the bzip2 tool (and once earlier with the gzip
tool, but that also involved dd pipes). I can't see it being a bzip2 problem
with large files, because since I'm using I/O redirection it doesn't even
know the size of the file ahead of time. It was failing in the first few meg
of the file until I did some essentially unrelated file system I/O, and then
it worked fine.

Still thinking a few people in this group should test this on their systems,
so we know whether it is a danger to BackupPC, and whether it is reproducible
enough to send to ... who? Linus and company? Slackware doesn't use Bugzilla.

- Rick Clark
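
One way to separate "the saved image is bad" from "reads of the same data
come back different each time" (again not from the thread; just a sketch
reusing the file and device names above): after a sync, re-read the same
range from the device and compare it byte-for-byte against the saved file.

  # sync
  # dd if=/dev/hda6 bs=1024 count=4100000 2>/dev/null | cmp - hda6.raw

cmp stays silent if all 4 GB match exactly; if it reports a differing byte,
the corruption is happening somewhere in the plain read/write path,
independent of anything bzip2 or gzip does.
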
From: Rick C. <rnp...@ch...> - 2003-10-29 15:34:33

I think BackupPC is safe from whatever this bzip2 and gzip problem is. It
seems to occur only when a pipe is involved, and the left side of the pipe
(dd) fills faster than the right side (bzip2) empties.

I thought perhaps it was a bzip2 bug in handling short reads, so I looked up
the source and found that it reads in chunks of 5,000 UChars. So I tried:

  dd if=/dev/hdb6 ibs=1024 obs=5000 count=4100000 | bzip2 > hdb6.bz2

But that still failed during uncompress. So then I figured it was a problem
with the pipe subsystem and stdio itself when the pipe overflows. So I
tried:

  nice -5 dd if=/dev/hdb6 bs=1024 count=4100000 | bzip2 > hdb6.bz2

The decompression check of this worked! So keeping the left side of the pipe
starved by nicing it down worked.

Again, who do I notify of this problem? Can someone try this on the latest
Red Hat please? They seem to be the current keepers of the bzip2 code and
have Bugzilla to report it through.

- Rick Clark
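
For anyone who wants to try reproducing this elsewhere, here is a minimal
sketch of a repeat-and-count loop (not from the thread; the device name,
output path, and iteration count are placeholders, and it assumes a
Bourne-compatible shell with dd and bzip2 installed):

  #!/bin/sh
  # Repeat the pipe-based compress run and test each result for CRC errors.
  # DEV, OUT, and N are placeholders - adjust them for the machine under test.
  DEV=/dev/hdb6
  OUT=/tmp/pipetest.bz2
  N=5
  fails=0
  i=1
  while [ $i -le $N ]; do
      dd if=$DEV bs=1024 count=4100000 2>/dev/null | bzip2 > $OUT
      if bzip2 -t $OUT; then
          echo "run $i: OK"
      else
          echo "run $i: bzip2 -t reported an error"
          fails=`expr $fails + 1`
      fi
      i=`expr $i + 1`
  done
  rm -f $OUT
  echo "$fails of $N runs failed"

Any nonzero failure count on another, known-good machine would suggest the
problem is not specific to this particular box.
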
From: Rick C. <rnp...@ch...> - 2003-11-19 02:32:35

You may remember that a few weeks ago I was struggling over concern that
gzip and bzip2 might be broken (or perhaps Linux 2.4.20), especially when
used on the right-hand side of a pipe.

I've finally found my problem, and it was not with Linux or bzip2. I had a
bad memory module. At least, I ran the problem script successfully twice in
a row with one module and not with the other. Shame on me for not using ECC
memory.

Apparently gzip and bzip2 really comb the memory with all of their CRC
checking. The pipe probably caused additional memory copies and forced data
out of the cache more often, making the problem even more noticeable.

So BackupPC is safe from this problem. But there is a lesson to be learned:
when running a data-critical application like BackupPC, use ECC memory in
the server!

- Rick Clark
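
Two ways to catch this kind of RAM fault before it eats a backup (neither is
from the thread): boot the machine into memtest86 overnight, or, as a rough
userspace check that leans on the same CRC behaviour Rick describes,
repeatedly compress and verify a fixed chunk of data; any failure on
identical input points at the hardware rather than the data. A sketch,
assuming /dev/urandom, dd, and bzip2 are available and that /tmp has a few
hundred megabytes free:

  #!/bin/sh
  # Rough userspace RAM check: compress and verify the same data repeatedly.
  # A bzip2 -t failure on identical input suggests flaky memory, CPU, or cache.
  # SIZE_MB, ROUNDS, and the temp file path are placeholders.
  SIZE_MB=256
  ROUNDS=20
  dd if=/dev/urandom of=/tmp/ramcheck.dat bs=1M count=$SIZE_MB 2>/dev/null
  i=1
  while [ $i -le $ROUNDS ]; do
      if bzip2 -c /tmp/ramcheck.dat | bzip2 -t; then
          echo "round $i: OK"
      else
          echo "round $i: FAILED - suspect hardware"
      fi
      i=`expr $i + 1`
  done
  rm -f /tmp/ramcheck.dat

This is no substitute for a real memory tester, but it exercises roughly the
same path that was failing here.
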