I thought it might be a nice idea to put the big, zero-filled SQL
databases on an e2compr filesystem. Wrong. When mysqld starts up and
reads the "compressed" files, the system experiences all kinds of
crashing behaviour: kernel page faults with register dumps, processes
dying or failing to start, and so on. This happens reproducibly on
mysqld startup. Several fsck runs (read: reboots! mysqld got started
each time!) fix the problem, complaining about various files. When
everything is up again, "ibdata1" (the main data file) is flagged
with "E".
This is kernel 2.4.26 with e2compr on a Celeron 2.4GHz.
I think the "offending" files are from the stock MySQL distribution
(I had no chance to cat in the 1 GB Wikipedia dump as I had planned),
but if not I can provide them. It may also be related to an uncommon
access method, maybe sparse writes or mmap() or whatever.
MySQL is 4.0.18-1mdk.
E-mail me at bengaal@freenet.de
Logged In: NO
PS. The method was gzip-9 (I didn't try any other).
Logged In: NO
Are you using serial ATA?
Logged In: NO
No, no serial ATA. Standard cheapo laptop, Mandrake Linux, ext2
partition on an IDE drive. The ext2 partition is quite huge, though,
with many inodes too. Nearly the whole partition (and another one too)
is e2compressed, even some OS data (/usr/share/whatever), and that
works fine.
Logged In: NO
I tried again. This time it didn't hose my system (yet), but it did
not work either. Here is what I did:
#rpm -e MySQL
#urpmi MySQL
[... this is MySQL from Mandrake 10, CD 3, with the data files
installed over a symbolic link from /var/lib/mysql to the
compressed FS]
The installed MySQL is MySQL-4.0.18-1mdk.
Startup of MySQL yielded this interesting kernel message:
Sep 15 10:48:31 localhost kernel: EXT2-fs warning (device
ide0(3,8)): ext2_decompress_pages: bad magic number: inode =
117006, magic = 0x00
Sure enough, inode 117006 is mysql's ibdata1. The funny thing is,
lsattr doesn't show any file in that dir as "E"... yet (it will
probably crash and burn on the next boot, however).
I don't know if that information helps, but have it nevertheless.
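For anyone else chasing such kernel messages: one way to map the
reported inode number back to a file name is find's -inum test. This
is only a sketch; inode_to_path is a made-up helper name, and the
scratch directory stands in for the real mount point of the
compressed filesystem.

```shell
#!/bin/sh
# Sketch: map an inode number from a kernel message back to a path.
# inode_to_path is a hypothetical helper, not part of e2compr or e2fsprogs.
inode_to_path() {
    # $1: directory (ideally the mount point), $2: inode number
    # -xdev keeps find on one filesystem, since inode numbers are per-filesystem.
    find "$1" -xdev -inum "$2" 2>/dev/null
}

# Self-contained demonstration on a scratch directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/ibdata1"
demo_inode=$(ls -i "$demo_dir/ibdata1" | awk '{print $1}')
demo_result=$(inode_to_path "$demo_dir" "$demo_inode")
echo "inode $demo_inode is $demo_result"
```

On the real system the call would look like
inode_to_path /path/to/compressed/fs 117006, with the mount point
filled in by hand.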
Logged In: NO
Me again. Please don't feel pushed or anything; I am just reporting
more data, which may help or not.
MySQL put on an uncompressed FS wastes gigabytes of space now due to
the lack of compression (it's mainly a read-only database, and I
thought I'd take the performance hit rather than store literally
gigabytes of zeroes). It works fine. The final database directory,
copied with cp -R to the compressed FS, makes mysql puke. I will maybe
try making ibdata1 read-only and uncompressing the other files, and
report the results.
A similar system crash happens when I compile the wikimedia software,
math subdirectory, on an e2fs with compression on. somefile.o seems
to be the "culprit". Attempts to isolate offending bitstrings proved
useless. It seems to be a particular access method that makes e2fs,
or kernel+e2fs, or maybe kernel+e2fs+whatever, go mad, trash kernel
structures and obliterate system stability. I really have no clue
about the inner workings of e2fs, but it could be the "hole" problem:
in Unix, seeking to some far-away position and writing some bytes
there is legal. But then again, maybe that's not the problem; maybe
it is mmap() or god-knows-what. When used with cp or cp -R, e2fs
never does such things. I have hundreds of thousands of files on this
e2fs partition and they just don't do such things. They are files of
all kinds, including difficult cases like PNG or JPG.
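To make the "hole" idea concrete, here is a minimal sketch of what
such a write looks like from the shell and what a correct filesystem
should return for the skipped range (zero bytes). The file name and
the offset of 9 KiB are arbitrary choices for illustration; nothing
here is specific to e2compr.

```shell
#!/bin/sh
# Sketch: create a file with a hole and check that the hole reads as zeros.
tmp=$(mktemp -d)
f="$tmp/sparse"
printf 'ABC\n' > "$f"

# Seek 9 blocks of 1024 bytes past the start, then write a single byte.
# conv=notrunc leaves the existing 4 bytes at the start untouched.
printf 'Z' | dd of="$f" bs=1024 seek=9 conv=notrunc 2>/dev/null

# Apparent size: 9 * 1024 bytes skipped plus the one byte written.
size=$(wc -c < "$f" | tr -d ' ')

# Count non-zero bytes between the 4-byte header and the final 'Z'.
nonzero=$(dd if="$f" bs=1 skip=4 count=$((9 * 1024 - 4)) 2>/dev/null \
    | tr -d '\000' | wc -c | tr -d ' ')

echo "size=$size nonzero_bytes_in_hole=$nonzero"
rm -rf "$tmp"
```

Any non-zero count in the hole would mean the filesystem handed back
stale or garbage data for blocks that were never written.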
Logged In: NO
Again me with more information. Bug still there after upgrading
to 2.4.26+latest e2compr for 26+patch2.4.27+patch2.4.28.
Same crash look and feel, so probably not a "heisenbug".
Additionally, when installing the nvidia driver, the kernel message
localhost kernel: compress.c:2754: <NULL>: Assertion
`ROUNDUP_RSHIFT(inode->i_size,
inode->i_sb->s_blocksize_bits) >= s_nblk' failed.
appears (no crash, and seems to work).
Also found messages:
Dec 3 20:38:02 localhost kernel: EXT2-fs warning (device
ide0(3,8)): ext2_decompress_pages: bad magic number: inode =
852394, magic = 0x1835
Dec 3 20:38:02 localhost kernel: compress.c:2465: <NULL>:
Assertion `tmp != 0xffffffff' failed.
Those seem to be related to the crash (make in wikipedia/math).
The tmp != assertion didn't occur earlier.
The blocksize on both compressed FS is 1K.
/dev/hda8: 216849/1003520 files (17.3% non-contiguous),
3838671/4008184 blocks
/dev/hda5: 117707/132096 files (11.2% non-contiguous),
878414/1052224 blocks
kernel compiled with -O2 and -march=i686 (standard)
gcc version 3.3.2 (Mandrake Linux 10.0 3.3.2-6mdk)
Pentium IV Celeron, 2.4GHz
best regards
Logged In: YES
user_id=481480
Ville Herva recently alerted me to the fact that e2compr is
not properly LFS-aware (since it was based on Linux 2.2
code, and at that time the kernel did not have Large File
Support). There are several instances of off_t that
need to be loff_t, to begin with. If your database extents
are >2 GB uncompressed, this may be a large part of your
problem.
Additionally, as you say, mmap support has not been very
strenuously tested.
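A quick way to check whether any file crosses the 2 GB mark is to
search for files larger than 2^31 - 1 bytes, the largest value a
signed 32-bit offset can hold. The following is only a sketch:
find_large_files is a made-up helper name, and the demonstration uses
a sparse scratch file rather than a real database.

```shell
#!/bin/sh
# Sketch: list regular files too large for a signed 32-bit file offset.
LIMIT=2147483647   # 2^31 - 1 bytes

# find_large_files is a hypothetical helper; $1 is the directory to scan.
find_large_files() {
    find "$1" -xdev -type f -size +"${LIMIT}c" 2>/dev/null
}

# Demonstration: one small file and one sparse file of exactly 2^31 bytes.
scratch=$(mktemp -d)
echo small > "$scratch/small"
# Copying nothing from /dev/null with seek= just extends the file (no data written).
dd if=/dev/null of="$scratch/big" bs=1 seek=2147483648 2>/dev/null

large=$(find_large_files "$scratch")
echo "$large"
rm -rf "$scratch"
```

On the affected machine the argument would be the database directory,
for example find_large_files /var/lib/mysql.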
Logged In: NO
This may be it - but the "compile software on e2compressed ."
test didn't involve files over 2G. Good luck finding bugs.
Best regards
Logged In: NO
me again :-)
Upgraded to 2.4.31 and avoided bug-triggering behaviour, but
had a hard crash today. e2compr is nice for managing shovels
and shovels of read-only bloat, like texts or locale junk.
It's especially nice on old boxes where HD space is really
tight. However, it is not really ready for production use.
Back to the bug. The log says:
Oct 6 15:18:27 localhost kernel: EXT2-fs warning (device
ide0(3,8)): ext2_decompress_pages: bad magic number: inode =
760316, magic = 0x1801
Oct 6 15:18:27 localhost kernel: balloc.c:284: <NULL>:
Assertion `(block != 0xffffffff) || !(((inode->i_mode) &
00170000) == 0100000) ||
!(sb->u.ext2_sb.s_es->s_feature_incompat &
((__u32)(0x0001)))' failed.
Oct 6 15:18:27 localhost kernel: EXT2-fs error (device
ide0(3,8)): ext2_free_blocks: Freeing blocks not in datazone
- block = 4294967295, count = 1
Oct 6 15:18:27 localhost kernel: balloc.c:284: <NULL>:
Assertion `(block != 0xffffffff) || !(((inode->i_mode) &
00170000) == 0100000) ||
!(sb->u.ext2_sb.s_es->s_feature_incompat &
((__u32)(0x0001)))' failed.
Oct 6 15:18:27 localhost kernel: EXT2-fs error (device
ide0(3,8)): ext2_free_blocks: Freeing blocks not in datazone
- block = 4294967295, count = 1
Oct 6 15:18:27 localhost kernel: balloc.c:284: <NULL>:
Assertion `(block != 0xffffffff) || !(((inode->i_mode) &
00170000) == 0100000) ||
!(sb->u.ext2_sb.s_es->s_feature_incompat &
((__u32)(0x0001)))' failed.
Oct 6 15:18:27 localhost kernel: EXT2-fs error (device
ide0(3,8)): ext2_free_blocks: Freeing blocks not in datazone
- block = 4294967295, count = 1
Oct 6 15:18:27 localhost kernel: balloc.c:284: <NULL>:
Assertion `(block != 0xffffffff) || !(((inode->i_mode) &
00170000) == 0100000) ||
!(sb->u.ext2_sb.s_es->s_feature_incompat &
((__u32)(0x0001)))' failed.
Oct 6 15:18:27 localhost kernel: EXT2-fs error (device
ide0(3,8)): ext2_free_blocks: Freeing blocks not in datazone
- block = 4294967295, count = 1
Oct 6 15:18:27 localhost kernel: balloc.c:284: <NULL>:
Assertion `(block != 0xffffffff) || !(((inode->i_mode) &
00170000) == 0100000) ||
!(sb->u.ext2_sb.s_es->s_feature_incompat &
((__u32)(0x0001)))' failed.
Oct 6 15:18:27 localhost kernel: EXT2-fs error (device
ide0(3,8)): ext2_free_blocks: Freeing blocks not in datazone
- block = 4294967295, count = 1
Oct 6 15:18:27 localhost kernel: balloc.c:284: <NULL>:
Assertion `(block != 0xffffffff) || !(((inode->i_mode) &
00170000) == 0100000) ||
!(sb->u.ext2_sb.s_es->s_feature_incompat &
((__u32)(0x0001)))' failed.
Oct 6 15:18:27 localhost kernel: EXT2-fs error (device
ide0(3,8)): ext2_free_blocks: Freeing blocks not in datazone
- block = 4294967295, count = 1
Oct 6 15:18:27 localhost kernel: balloc.c:284: <NULL>:
Assertion `(block != 0xffffffff) || !(((inode->i_mode) &
00170000) == 0100000) ||
!(sb->u.ext2_sb.s_es->s_feature_incompat &
((__u32)(0x0001)))' failed.
Oct 6 15:18:27 localhost kernel: EXT2-fs error (device
ide0(3,8)): ext2_free_blocks: Freeing blocks not in datazone
- block = 4294967295, count = 1
Oct 6 15:18:27 localhost kernel: balloc.c:284: <NULL>:
Assertion `(block != 0xffffffff) || !(((inode->i_mode) &
00170000) == 0100000) ||
!(sb->u.ext2_sb.s_es->s_feature_incompat &
((__u32)(0x0001)))' failed.
Oct 6 15:18:27 localhost kernel: EXT2-fs error (device
ide0(3,8)): ext2_free_blocks: Freeing blocks not in datazone
- block = 4294967295, count = 1
Oct 6 15:18:27 localhost kernel: balloc.c:284: <NULL>:
Assertion `(block != 0xffffffff) || !(((inode->i_mode) &
00170000) == 0100000) ||
!(sb->u.ext2_sb.s_es->s_feature_incompat &
((__u32)(0x0001)))' failed.
Oct 6 15:18:27 localhost kernel: EXT2-fs error (device
ide0(3,8)): ext2_free_blocks: Freeing blocks not in datazone
- block = 4294967295, count = 1
Oct 6 15:18:27 localhost kernel: EXT2-fs warning (device
ide0(3,8)): ext2_decompress_pages: bad magic number: inode =
760316, magic = 0x1801
followed by CRASH AND BURN messages (nothing really works; X
dies with a kernel stack trace thrice per second, login
complains about duping fds, etc. System hosed).
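A side note on reading the log above: the "block = 4294967295" in the
ext2_free_blocks error and the 0xffffffff in the balloc.c assertion
are the same 32-bit value, all bits set, which looks like a sentinel
rather than a real block address (an assumption on my part; I have
not read the code). The conversion is easy to check:

```shell
#!/bin/sh
# Sketch: show that the decimal block number from the log is 0xffffffff.
block=4294967295
hex=$(printf '%#x' "$block")
echo "$block = $hex"

# The same value is the largest unsigned 32-bit number, 2^32 - 1.
max32=$(( (1 << 32) - 1 ))
echo "2^32 - 1 = $max32"
```

The other constants in the assertion are standard ones: i_mode masked
with 00170000 and compared to 0100000 tests for a regular file, and
the incompat feature bit 0x0001 is the ext2 compression flag.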
At the time I had at least 2 concurrent wget tasks
churning away onto the filesystem. wget usually does not
trigger the bug the way gcc or mysql do. I thought
e2compr may have issues with concurrent access; are you sure
the code is safe?
On the other hand, the offending inode shows up as a small
(< 100K) innocent HTML file, and lsattr does not show the E flag
(it says cB 16 gzip9). The best thing is that the date is
set to -- yesterday. I don't really remember wgetting
this particular target yesterday, and there was certainly no crash.
Actually, I ran stat(1) for the first time. I don't claim to
understand those attributes, really:
Access: 2005-10-06 16:05:25.000000000 +0200
Modify: 2005-10-05 20:31:45.000000000 +0200
Change: 2005-10-06 15:18:28.000000000 +0200
'change' seems to be the crash time. But those were wget
-nc? *shrug*
Reading the file itself produces
Oct 6 16:01:13 localhost kernel: EXT2-fs warning (device
ide0(3,8)): illegal method id: inode = 760316, id = 32
and lots more of the same messages. What can be read of it
is some HTML with binary junk inside it. Interestingly, the
contents and the junk vary by access method! Reading with mc
F3 yields a different junk position than with cp or with Shift-F3
(which actually bus errors). When reading with cp the junk
is somewhere unaligned, while other access methods seem to
align the junk on a nice 0x400 (at least) boundary.
Further experiments were thwarted by the fact that the same
command (cat |) that worked earlier now says "Input/output
error". The E flag didn't get set, though; only when
I ran lsattr -lruv did it complain heavily about bad cluster 2
and set the E flag.
Ok, some more info about the kernel config:
Intel P4 Celeron (i686)
afaics compiled with (from Makefile)
CFLAGS := $(CPPFLAGS) -Wall -march=i686 -mcpu=pentium4 \
          -fexpensive-optimizations -Wstrict-prototypes -Wno-trigraphs \
          -O2 -fno-strict-aliasing -fno-common
ifndef CONFIG_FRAME_POINTER
CFLAGS += -fomit-frame-pointer
endif
and with gcc (I also have gcc 4.x but it doesn't work on the
kernel anyway, so it can't be compiled with this one)
Reading specs from
/usr/lib/gcc-lib/i586-mandrake-linux-gnu/3.3.2/specs
Configured with: ../configure --prefix=/usr
--libdir=/usr/lib --with-slibdir=/lib
--mandir=/usr/share/man --infodir=/usr/share/info
--enable-shared --enable-threads=posix --disable-checking
--enable-long-long --enable-__cxa_atexit
--enable-clocale=gnu
--enable-languages=c,c++,ada,f77,objc,java,pascal
--host=i586-mandrake-linux-gnu --with-system-zlib
Thread model: posix
gcc version 3.3.2 (Mandrake Linux 10.0 3.3.2-6mdk)
kernel options pertaining to ext2 (grep):
(don't ask me to fiddle with them, please, I don't bake
kernels daily)
CONFIG_EXT2_FS=y
CONFIG_EXT2_COMPRESS=y
CONFIG_EXT2_HAVE_LZO=m
CONFIG_EXT2_HAVE_LZV1=m
CONFIG_EXT2_HAVE_LZRW3A=m
CONFIG_EXT2_HAVE_GZIP=y
CONFIG_EXT2_HAVE_BZIP2=m
# CONFIG_EXT2_DEFAULT_COMPR_METHOD_DEFER is not set
# CONFIG_EXT2_DEFAULT_COMPR_METHOD_LZO is not set
# CONFIG_EXT2_DEFAULT_COMPR_METHOD_LZV1 is not set
# CONFIG_EXT2_DEFAULT_COMPR_METHOD_LZRW3A is not set
CONFIG_EXT2_DEFAULT_COMPR_METHOD_GZIP=y
# CONFIG_EXT2_DEFAULT_COMPR_METHOD_BZIP2 is not set
# CONFIG_EXT2_DEFAULT_COMPR_METHOD_GZIP1 is not set
# CONFIG_EXT2_DEFAULT_COMPR_METHOD_GZIP2 is not set
# CONFIG_EXT2_DEFAULT_COMPR_METHOD_GZIP3 is not set
# CONFIG_EXT2_DEFAULT_COMPR_METHOD_GZIP4 is not set
# CONFIG_EXT2_DEFAULT_COMPR_METHOD_GZIP5 is not set
# CONFIG_EXT2_DEFAULT_COMPR_METHOD_GZIP6 is not set
# CONFIG_EXT2_DEFAULT_COMPR_METHOD_GZIP7 is not set
# CONFIG_EXT2_DEFAULT_COMPR_METHOD_GZIP8 is not set
CONFIG_EXT2_DEFAULT_COMPR_METHOD_GZIP9=y
# CONFIG_EXT2_DEFAULT_CLUSTER_BITS_2 is not set
# CONFIG_EXT2_DEFAULT_CLUSTER_BITS_3 is not set
CONFIG_EXT2_DEFAULT_CLUSTER_BITS_4=y
# CONFIG_EXT2_DEFAULT_CLUSTER_BITS_5 is not set
CONFIG_EXT2_COMPR_X86_CODE=y
CONFIG_EXT2_SEPARATE_WORK_AREAS=y
CONFIG_EXT2_VERIFY_COMPRESSION=y
Hardware: I thought of maybe a flipped bit, but this
hardware, cheap as it is, doesn't tend to produce sig11 or
mystery crashes. At least, as I could trigger the problem
reproducibly with gcc or mysql I don't tend to assume bit rot.
best regards
Logged In: YES
user_id=1214484
Originator: NO
It could also be the so-called bdev bug. This concerns databases in
particular, because these kinds of applications use seek operations
on files and can therefore create file "holes". The rpm databases,
for example, are affected too.
The bdev bug was fixed in the 2.6.22.5 patch.
It can be reproduced with the following script; test it
with different seek=n values.
#!/bin/bash
echo "ABC" > hole.org
echo -n "Z" | dd of=hole.org bs=1024 seek=9
echo "ABC" > hole
chattr +c hole
echo -n "Z" | dd of=hole bs=1024 seek=9
# should not result in a diff, if everything works correctly
diff hole hole.org
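A possible way to try several seek=n values in one go is to wrap the
same steps in a loop. This is only a sketch: the list of offsets is
arbitrary, HOLE_TEST_DIR is a made-up variable that should point at a
directory on the compressed filesystem (it defaults to a temporary
directory, which only exercises the script itself), and cmp -s is
used instead of diff to keep the output short.

```shell
#!/bin/sh
# Sketch: repeat the hole test for several seek offsets.
# HOLE_TEST_DIR is a hypothetical variable; set it to a directory on the
# e2compr filesystem under test.
workdir=$(mktemp -d "${HOLE_TEST_DIR:-${TMPDIR:-/tmp}}/holetest.XXXXXX") || exit 1
cd "$workdir" || exit 1

mismatches=0
for n in 1 3 4 9 15 16 17 33 100; do
    # Reference file without the compression attribute.
    echo "ABC" > hole.org
    printf 'Z' | dd of=hole.org bs=1024 seek="$n" 2>/dev/null

    # Same content, but flagged for compression before the sparse write.
    echo "ABC" > hole
    chattr +c hole 2>/dev/null \
        || echo "seek=$n: chattr +c failed, result is not meaningful" >&2
    printf 'Z' | dd of=hole bs=1024 seek="$n" 2>/dev/null

    if ! cmp -s hole hole.org; then
        echo "seek=$n: compressed and uncompressed files differ"
        mismatches=$((mismatches + 1))
    fi
done
echo "mismatches: $mismatches"
```

A count of zero is the expected result on a correctly working
filesystem.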
Fixed in patch 0.4.51 for the 2.6 series. To be tested for 2.4.x.
9 Sep 2007
Matthias Winkler <m.winkler@unicon-ka.de>
* fixed bdev-bug. this bug appeared primarily when
files contained holes. A page with holes, which
was dirty caused ext2_get_cluster_blocks [ext2_get_block()]
to create ALL blocks of the page, even if there were holes!
These allocated hole-blocks weren't set to 0 anywhere and
therefore contained invalid data. I changed the
code to never allocate these holes.